Preface


This volume collects a selection of refereed papers of the more than one hundred 
presented at the International Conference MAF 2008 — Mathematical and Statistical 
Methods for Actuarial Sciences and Finance. 

The conference was organised by the Department of Applied Mathematics and 
the Department of Statistics of the University Ca’ Foscari Venice (Italy), with the col- 
laboration of the Department of Economics and Statistical Sciences of the University 
of Salerno (Italy). It was held in Venice, from March 26 to 28, 2008, at the prestigious 
Cavalli Franchetti palace, along Grand Canal, of the Istituto Veneto di Scienze, Lettere 
ed Arti. 

This conference was the first international edition of a biennial national series 
begun in 2004, which was born of the brilliant belief of the colleagues — and friends — 
of the Department of Economics and Statistical Sciences of the University of Salerno: 
the idea following which the cooperation between mathematicians and statisticians 
in working in actuarial sciences, in insurance and in finance can improve research on 
these topics. The proof of this consists in the wide participation in these events. In 
particular, with reference to the 2008 international edition: 


- more than 150 attendees, both academics and practitioners;

- more than 100 accepted communications, organised in 26 parallel sessions, from
authors coming from about twenty countries (namely: Canada, Colombia, Czech
Republic, France, Germany, Great Britain, Greece, Hungary, Ireland, Israel, Italy,
Japan, Poland, Spain, Sweden, Switzerland, Taiwan, USA);

- two plenary guest-organised sessions; and

- a prestigious keynote lecture delivered by Professor Wolfgang Härdle of the
Humboldt University of Berlin (Germany).


The papers published in this volume cover a wide variety of subjects: actuarial
models; ARCH and GARCH modelling; artificial neural networks in finance; copulas;
corporate finance; demographic risk; energy markets; insurance and reinsurance;
interest rate risk; longevity risk; Monte Carlo approaches; mutual fund analysis;
non-parametric testing; option pricing models; ordinal models; probability
distributions and stochastic processes in finance; risk measures; robust estimation in finance;
solvency analysis; static and dynamic portfolio management; time series analysis;
volatility term structure; and trading systems.

Of course, the favourable outcome of this conference would not have been possible 
without the precious help of our sponsors (in alphabetical order): Banca d’Italia; 
Casino Municipale di Venezia; Cassa di Risparmio di Venezia; Istituto Veneto di 
Scienze, Lettere ed Arti; Provincia di Venezia; and VENIS — Venezia Informatica e 
Sistemi. We truly thank them all. 

Moreover, we also express our gratitude to the members of the Scientific and the 
Organising Committees, and to all the people whose collaboration contributed to the 
success of the MAF 2008 conference. 

Finally, we would like to report that the organisation of the next conference
has already begun: the MAF 2010 conference will be held in Ravello (Italy), on
the Amalfi Coast, from April 7 to 9, 2010 (for more details visit the website
http://maf2010.unisa.it/). We look forward to your attendance.


Venezia, August 2009 Marco Corazza and Claudio Pizzi 


Impact of interest rate risk on the Spanish
banking sector 


Laura Ballester, Roman Ferrer, and Cristobal Gonzalez 


Abstract. This paper examines the exposure of the Spanish banking sector to interest rate 
risk. With that aim, a univariate GARCH-M model, which takes into account not only the 
impact of interest rate changes but also the effect of their volatility on the distribution of bank 
stock returns, is used. The results show that both changes and volatility of interest rates have a 
negative and significant impact on the stock returns of the Spanish banking industry. Moreover, 
there seems to be a direct relationship between the size of banking firms and their degree of 
interest rate sensitivity. 


Key words: interest rate risk, banking firms, stocks, volatility 


1 Introduction 


Interest rate risk (IRR) is one of the key forms of financial risk faced by banks. This 
risk stems from their role as financial intermediaries and it has been attributed to two 
major reasons. First, in their balance sheets, banks primarily hold financial assets and 
liabilities contracted in nominal terms. Second, banks traditionally perform a matu- 
rity transformation function using short-term deposits to finance long-term loans. The 
resulting maturity mismatch introduces volatility into banks’ income and net worth 
as interest rates change, and this is often seen as the main source of bank IRR. In
recent years, IRR management has gained prominence in the banking sector because
interest rates have become substantially more volatile and because the new Basel
Capital Accord (Basel II) has increased concern about this topic. The most
common approach to measuring bank interest rate exposure has consisted of
estimating the sensitivity of bank stock returns to interest rate fluctuations. Knowledge
of the effect of interest rate variability on bank stocks is important for bank managers,
who need to manage IRR adequately; for investors, for hedging and asset allocation
purposes; and for banking supervisors, who need to guarantee the stability of the
banking system. The main objective of this paper is to investigate the interest rate
exposure of the Spanish banking
industry at a portfolio level by using the GARCH (generalised autoregressive condi- 
tional heteroskedasticity) methodology. Its major contribution is to examine for the 
first time in the Spanish case the joint impact of interest rate changes and interest rate 
volatility on the distribution of bank stock returns. The rest of the paper is organised
as follows. Section 2 contains a review of the relevant literature. The methodology 
employed and data used are described in Sections 3 and 4, respectively. Section 5 
reports the empirical results. Finally, Section 6 concludes. 


2 Literature review 


The influence of IRR on bank stocks is an issue addressed by a considerable amount 
of literature. The bulk of the research has focused on the two-index model postulated 
by [18] and several general findings can be stated. First, most of the papers document 
a significant negative relationship between interest rate movements and bank stock 
returns. This result has been mainly attributed to the typical maturity mismatch be- 
tween banks’ assets and liabilities. Banks are normally exposed to a positive duration 
gap because the average duration of their assets exceeds the average duration of their 
liabilities. Thus, the net interest income and the bank value are negatively affected 
by rising interest rates and vice versa. Second, bank stocks tend to be more sensitive 
to changes in long-term interest rates than to short-term rates. Third, interest rate 
exposure of banks has declined over recent years, probably due to the development 
of better systems for managing IRR. 

Early studies on bank IRR were based on standard regression techniques under 
the restrictive assumptions of linearity, independence and constant conditional vari- 
ance of stock returns (see, e.g., [1, 10]). Later on, several studies (see, e.g., [14, 15]) 
provided evidence against constant conditional variance. A more recent strand of 
literature attempts to capture the time-varying nature of the interest rate sensitivity 
of bank stock returns by using GARCH-type methodology. Specifically, [17] led the 
way in the application of ARCH methodology in banking, showing its suitability for 
bank stock analysis. Subsequent studies have used different types of GARCH pro- 
cesses to examine interest rate exposure of banks. For example, [5] and [16] have 
employed univariate GARCH-M (GARCH in mean) models to examine both the ef- 
fect of changes in interest rates and their volatility on bank stock returns, whereas [6] 
and [9] have used multivariate GARCH-M models. 


3 Methodology 


The model proposed can be viewed as an extended version of a univariate 
GARCH(1,1)-M model similar to the formulations by [5] and [16]. It is as follows: 


$$R_{it} = \omega_i + \lambda_i R_{mt} + \theta_i \Delta I_t + \gamma_i \log h_{it} + \epsilon_{it} \tag{1}$$
$$h_{it} = \alpha_0 + \alpha_1 \epsilon_{it-1}^2 + \beta h_{it-1} + \delta_i VCI_{t-1} \tag{2}$$
$$\epsilon_{it} \mid \Omega_{t-1} \sim N(0, h_{it}) \tag{3}$$


where $R_{it}$ denotes the return on bank $i$'s stock in period $t$, $R_{mt}$ the return on the
market portfolio in period $t$, $\Delta I_t$ the change in the interest rate in period $t$, $\epsilon_{it}$ an
error term with zero mean and conditional variance $h_{it}$, which is dependent on the
information set $\Omega_{t-1}$, and $VCI_{t-1}$ the interest rate volatility in period $t-1$. Moreover,
$\omega_i$, $\lambda_i$, $\theta_i$, $\gamma_i$, $\alpha_0$, $\alpha_1$, $\beta$ and $\delta_i$ are the parameters to be estimated. In particular, $\lambda_i$
describes the sensitivity of the return on the $i$th bank stock to general market fluctuations
and it can be seen as a measure of market risk. In turn, $\theta_i$ reflects the sensitivity of
the return on the $i$th bank stock to movements in interest rates controlling for changes
in the market return. Hence, it is a measure of the $i$th bank's IRR. As usual, to preserve
the non-negativity requirement for the conditional variance $\alpha_0, \alpha_1, \beta > 0$, whereas
$\alpha_1 + \beta < 1$ for stability to hold.
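
As an illustration of how equations (1)-(3) fit together, the following is a minimal sketch of the Gaussian negative log-likelihood that a maximum likelihood routine would minimise. It is not the authors' estimation code: the function name, the argument names and the choice of starting the variance recursion at the sample variance of returns are all assumptions made for this example.

```python
import numpy as np

def garch_m_neg_loglik(params, r, r_m, d_i, vci):
    """Gaussian negative log-likelihood for equations (1)-(3).

    params = (omega, lam, theta, gamma, alpha0, alpha1, beta, delta).
    r, r_m, d_i and vci are equal-length 1-D sequences; vci[t] is the
    interest rate volatility in period t, so the recursion uses vci[t - 1].
    """
    omega, lam, theta, gamma, alpha0, alpha1, beta, delta = params
    r, r_m, d_i, vci = (np.asarray(x, dtype=float) for x in (r, r_m, d_i, vci))
    n = r.size
    h = np.empty(n)
    eps = np.empty(n)
    # Assumption of this sketch: initialise h at the sample variance of returns.
    h[0] = np.var(r)
    if h[0] <= 0.0:
        return np.inf
    eps[0] = r[0] - (omega + lam * r_m[0] + theta * d_i[0] + gamma * np.log(h[0]))
    for t in range(1, n):
        # Equation (2): conditional variance with lagged interest rate volatility.
        h[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta * h[t - 1] + delta * vci[t - 1]
        if h[t] <= 0.0:
            # delta may be negative, so positivity of h is not guaranteed.
            return np.inf
        # Equation (1): mean equation with log-variance in the mean.
        eps[t] = r[t] - (omega + lam * r_m[t] + theta * d_i[t] + gamma * np.log(h[t]))
    # Equation (3): conditional normality of the error term.
    return 0.5 * np.sum(np.log(2.0 * np.pi) + np.log(h) + eps ** 2 / h)
```

A function of this form could be passed to a general-purpose optimiser such as `scipy.optimize.minimize`; returning infinity for a non-positive variance is one simple way of keeping the search inside the admissible region.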

The GARCH-M approach is consistent with the patterns of leptokurtosis and 
volatility clustering frequently observed in stock markets and allows for the consid- 
eration of time-varying risk premia and an empirical assessment of the relationship 
between risk and return. Some features of the model should be highlighted. First, it 
incorporates the conditional variance $h_{it}$ as an additional explanatory variable in (1).
The specification of volatility in logarithmic form is based on [7]. Second, the typical
structure of GARCH processes has been extended in (2) by modelling the conditional
variance as a function of the conditional interest rate volatility lagged one period. In
this respect, even though the effect of interest rate volatility on stock returns has been 
considered in the literature to a lesser extent than the impact of interest rate changes, 
the interest rate volatility is important enough to be taken into account. As [5] points 
out, this variable conveys critical information about the overall volatility of the finan- 
cial markets and it influences the volatility of bank stock returns also at the micro 
level. 

There are also several critical aspects regarding the model estimation. The first 
issue has to do with the possible multicollinearity between the series of market portfolio
return and interest rate changes, which could generate serious estimation problems. 
Due to the significant negative correlation typically observed in the Spanish case 
between these two variables, an orthogonalisation procedure has been used. Since the 
central aim of this study is to analyse the banks’ IRR, the market portfolio return has 
been orthogonalised as in [10] or [11]. Thus, the residuals from an auxiliary regression 
of the market return series on a constant and the interest rate fluctuations series, by 
construction uncorrelated with the interest rate changes, have replaced the original 
market portfolio returns in (1). 
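
The orthogonalisation step can be sketched as an auxiliary least-squares regression whose residuals replace the market return. The function and argument names below are illustrative assumptions, not part of the original study.

```python
import numpy as np

def orthogonalise_market_return(r_m, d_i):
    """Residuals of an OLS regression of r_m on a constant and d_i.

    r_m: market portfolio returns; d_i: interest rate changes.
    The residuals are uncorrelated with d_i in the sample by construction.
    """
    r_m = np.asarray(r_m, dtype=float)
    d_i = np.asarray(d_i, dtype=float)
    # Design matrix of the auxiliary regression: constant plus interest rate changes.
    X = np.column_stack([np.ones_like(d_i), d_i])
    coef, *_ = np.linalg.lstsq(X, r_m, rcond=None)
    return r_m - X @ coef
```

Because the auxiliary regression includes a constant, the residual series also has zero sample mean, so any average market return is absorbed by the intercept of the main model.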

A second issue concerns the choice of the interest rate proxy to be used. In this 
sense, long-term interest rates are the proxy most employed in the literature, since 
they are assumed to exert great influence on corporate decisions and overall economic 
activity. Nevertheless, in order to enhance the robustness of the results, short-term 
interest rates and the spread between long- and short-term rates have been used as 
well. With regard to the short-term rates, an interbank rate has been chosen since the 
money market has become a key reference for Spanish banks during recent years. 
In turn, the interest rate spread is considered a good proxy for the slope of the yield 
curve. 


4 Data


The sample consists of all commercial banks listed on the Spanish Stock Exchange 
for at least one year over the period January 1993—December 2005 (23 banks in 
total). Monthly stock returns have been obtained from the Bolsa de Madrid database. 
The market portfolio used is a modified version of the Indice General de la Bolsa de 
Madrid (IGBM), the widest Spanish stock market index. Due to the major relevance 
of bank stocks in the IGBM, an alternative index where banks are excluded has been 
constructed in order to obtain a series of market returns as exogenous as possible. Mar- 
ket interest rates have been proxied by the monthly average yield on 10-year Spanish 
government bonds and the 3-month average rate of the Spanish interbank market, 
whereas the interest rate spread has been computed as the difference between them. 
Following [5] and [16], interest rate volatility has been measured by the conditional 
variance of interest rates, which is generated using a GARCH process. 

To check whether there is a relationship between bank size and IRR, bank stocks 
have been sorted by size into three portfolios — large (L), medium (M) and small (S) 
banks. This classification (see Table 1) is based on the three categories of commercial 
banks typically distinguished in the Spanish banking industry. Thus, the L portfolio 
includes the banks that have given rise to the two currently multinational Spanish 
banking conglomerates (B. Santander and BBVA). The M portfolio is made up of a 
group of medium-sized Spanish banks that operate in national territory. Finally, the S 
portfolio is constituted by a broad set of small banks that operate mostly at a regional
level.¹ The formation of portfolios has a twofold advantage. First, it is an efficient way
of condensing substantial amounts of information. Second, it helps to smooth out the
noise in the data due to shocks to individual stocks. On the other hand, portfolios can
mask the dissimilarities among banks within each portfolio. In this case, the mentioned
advantages seem to outweigh this inconvenience, according to the number of papers
based on bank stock portfolios (see, e.g., [5, 6, 17]). Monthly value-weighted portfolio
returns have been obtained using year-end market capitalisation as the weight factor
for each individual bank stock.


Table 1. List of banks and composition of bank portfolios


Portfolios            Asset volume (€ × 10³)   Obs.    Portfolios            Asset volume (€ × 10³)   Obs.

Portfolio L
BSCH                             396,124,995     81    B. Bilbao Vizcaya                100,026,979     85
BBVA                             297,433,664     71    Argentaria                        69,998,972     80
B. Santander                     113,404,303     75    B. Central Hispano                68,793,146     75

Portfolio M
Banesto                           42,332,585    156    Bankinter                         15,656,910    156
B. Exterior                       32,130,967     51    B. Pastor                          8,789,945    156
B. Popular                        29,548,620    156    B. Atlantico                       7,591,378    138
B. Sabadell                       26,686,670     56

Portfolio S
B. Zaragozano                      4,597,099    130    B. Galicia                         1,726,563    156
B. Valencia                        4,213,420    156    B. de Vasconia                     1,330,458    156
B. Guipuzcoano                     4,082,463    156    B. de Vitoria                        875,974     62
B. Andalucia                       3,521,838    156    B. Crédito Balear                    854,972    156
B. Herrero                         2,624,824     95    B. Alicante                          835,576     64
B. de Castilla                     2,151,742    156    B. Simeón                            686,451     67


This table displays the list of Spanish commercial banks considered and their distribution in
portfolios according to size criteria (portfolios L, M and S).
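
The value-weighting of portfolio returns described in this section can be sketched for a single month as follows. The function name and the treatment of banks with a missing return (dropped, with weights renormalised) are assumptions of this example; the paper only states that year-end market capitalisation is the weight factor.

```python
import numpy as np

def value_weighted_return(returns, market_caps):
    """Value-weighted portfolio return for one month.

    returns: monthly returns of the banks in the portfolio (NaN if not listed).
    market_caps: year-end market capitalisations used as weights.
    Assumption: banks with a missing return are dropped and the remaining
    weights are renormalised to sum to one.
    """
    returns = np.asarray(returns, dtype=float)
    caps = np.asarray(market_caps, dtype=float)
    valid = ~np.isnan(returns)
    weights = caps[valid] / caps[valid].sum()
    return float(weights @ returns[valid])
```

For instance, two banks with capitalisations 3 and 1 and returns of 10% and 0% give a portfolio return of 7.5%.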


5 Empirical results 


Table 2 contains the descriptive statistics of bank stock portfolio returns. They suggest 
that the data series are skewed and leptokurtic relative to the normal distribution. In 
addition, there is evidence of nonlinear dependence, possibly due to autoregressive 
heteroskedasticity. Overall, these diagnostics indicate that a GARCH-type process 
is appropriate for studying the IRR of bank stocks. Table 3 reports the parameters 
of the GARCH models estimated using the three alternative interest rate proxies.²
The coefficient on the market return, $\lambda_i$, is highly significant, positive and less than
unity in all cases. Further, its absolute value increases as the portfolio size increases,
indicating that market risk is directly related to bank size. This is a relevant and
unambiguous result, because it is not affected by the weight of banks in the market
index since they have been explicitly excluded from the market portfolio. The fact
that $\lambda_i < 1$ suggests that bank stock portfolios have less market risk than the overall
stock market.


Table 2. Descriptive statistics of bank portfolio stock returns


     Mean    Variance   Skewness   Kurtosis    JB          Q(12)      Q(24)    Q²(12)     Q²(24)
L    0.016   0.006      -0.44***   5.15***     35.41***    9.63       12.55    49.59***   61.6***
M    0.011   0.002      -0.002     5.34***     35.82***    9.89       19.51    95.92***   109.5***
S    0.013   0.001      2.20***    13.42***    833.6***    29.28***   35.63*   25.93**    28.35


JB is the Jarque-Bera test statistic which tests the null hypothesis of normality of returns. Q(n)
is the Ljung-Box statistic at a lag of n, which tests the presence of serial correlation. As usual,
***, ** and * denote significance at the 1%, 5% and 10% levels, respectively.
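
For reference, the two diagnostics reported in Table 2 can be computed directly from a return series. This is a sketch of the textbook formulas, with function names chosen for this example; it makes no claim about the software the authors used.

```python
import numpy as np

def jarque_bera(x):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    z = x - x.mean()
    m2 = np.mean(z ** 2)
    skew = np.mean(z ** 3) / m2 ** 1.5
    kurt = np.mean(z ** 4) / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

def ljung_box(x, lags):
    """Ljung-Box Q statistic: n(n + 2) * sum_k rho_k^2 / (n - k)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    z = x - x.mean()
    denom = np.sum(z ** 2)
    q = 0.0
    for k in range(1, lags + 1):
        # Sample autocorrelation at lag k.
        rho_k = np.sum(z[k:] * z[:-k]) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q
```

Applying `ljung_box` to squared returns is the usual way of checking for autoregressive heteroskedasticity; reading the Q² columns of the table that way is an assumption here, since the table note defines only Q(n).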


¹ The composition of bank portfolios is fixed for the whole sample period. Alternatively, we
have also considered an annual restructuring of the portfolios according to their volume of
total assets, and the results obtained in that case were very similar to those reported in this
paper.

² The final model to be estimated for portfolio S does not include the conditional variance of
interest rates since its inclusion would generate serious problems in the estimation of the
model due to the small variability of the returns on that portfolio.


Table 3. Maximum likelihood estimates of the GARCH-M extended model


3-month interest rate changes

              $\omega_i$    $\lambda_i$   $\theta_i$    $\gamma_i$    $\alpha_0$    $\alpha_1$    $\beta$       $\delta_i$
Portfolio L   -0.01***      0.96***       -1.12         0.004***      0.0003***     0.09***       0.82***       -15.04***
              (4.99)        (17.60)       (-0.09)       (9.47)        (10.22)       (5.67)        (54.94)       (-8.56)
Portfolio M   0.02***       0.50***       [illegible]   0.002***      0.0004***     0.15***       0.66***       -13.88***
              (11.04)       (10.16)       (-1.17)       (8.87)        (18.63)       (4.74)        (27.83)       (-12.98)
Portfolio S   -0.15***      0.27***       -1.31         -0.02***      0.00009***    0.03***       0.89***       -
              (-53.73)      (5.12)        (-1.17)       (-58.56)      (12.89)       (6.33)        (148.20)      -

10-year interest rate changes

              $\omega_i$    $\lambda_i$   $\theta_i$    $\gamma_i$    $\alpha_0$    $\alpha_1$    $\beta$       $\delta_i$
Portfolio L   0.03***       0.89***       -6.80***      0.003***      0.0004***     0.14***       0.79***       -45.34***
              (9.08)        (15.38)       (-7.25)       (6.41)        (11.02)       (6.65)        (43.33)       (-8.97)
Portfolio M   0.04***       0.48***       -3.19***      0.005***      0.0003***     0.11***       0.78***       -30.49***
              (14.29)       (8.82)        (-3.04)       (12.18)       (19.48)       (5.19)        (48.76)       (-10.36)
Portfolio S   -0.11***      0.25***       -3.28***      -0.01***      0.00009***    0.04***       0.87***       -
              (-42.01)      (4.26)        (-3.37)       (-46.50)      [illegible]   [illegible]   [illegible]   -

Interest rate spread

              $\omega_i$    $\lambda_i$   $\theta_i$    $\gamma_i$    $\alpha_0$    $\alpha_1$    $\beta$       $\delta_i$
Portfolio L   0.13***       0.95***       0.32          0.018***      0.0003***     0.10**        0.80***       -10.00***
              (3.24)        (15.82)       (-0.90)       (3.10)        (2.17)        (2.08)        (11.16)       (-3.79)
Portfolio M   0.05***       0.51***       0.03          0.007***      0.0001***     0.08***       0.83***       -9.10***
              (18.84)       (9.71)        (0.18)        (16.78)       (12.45)       (5.39)        (66.60)       (-5.40)
Portfolio S   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   -
              (-66.93)      (5.20)        (-3.23)       (-75.23)      (12.67)       (5.90)        (155.75)      -


This table shows the maximum likelihood estimates of the GARCH(1,1)-M extended model
for the different interest rate proxies based on equations (1)-(3). Values of t-statistics are in
parentheses and ***, ** and * denote statistical significance at the 1%, 5% and 10% levels,
respectively.

Concerning the impact of interest rate changes, $\theta_i$ is always negative and
statistically significant when long-term rates are used. Long-term rates exert the strongest
influence on bank stock portfolio returns, consistent with previous research (see,
e.g., [3, 5, 6]).

The IRR of the Spanish banking industry also seems to be directly related to 
bank size. This finding may be attributed to three factors. First, the aggressive pricing
policies introduced by larger banks over recent years, especially on the asset side, to
increase their market share in an environment of sharply falling interest rates and
intense competition have led to an extraordinary increase in adjustable-rate products
tied to interbank market rates. Second, large banks engage more extensively in
derivative positions. Third, large banks may have an incentive to assume higher risks,
induced by a moral hazard problem associated with their "too big to fail" status. As a
result, the revenues and stock performance of bigger banks
are now much more affected by market conditions. In contrast, more conservative 
pricing strategies of small banks, together with a minor use of derivatives and a 
heavier weight of idiosyncratic factors (e.g., rumours of mergers and acquisitions), 
can justify their lower exposure to IRR. 

To provide greater insight into the relative importance of both market risk and IRR 
for explaining the variability of bank portfolio returns, a complementary analysis has 
been performed. A two-factor model as in [18] is the starting point: 


$$R_{it} = \omega_i + \lambda_i R_{mt} + \theta_i \Delta I_t + \epsilon_{it} \tag{4}$$


Since both explanatory variables are uncorrelated by construction, the variance
of the return of each bank stock portfolio, $\mathrm{Var}(R_{it})$, can be written as:


$$\mathrm{Var}(R_{it}) = \lambda_i^2 \mathrm{Var}(R_{mt}) + \theta_i^2 \mathrm{Var}(\Delta I_t) + \mathrm{Var}(\epsilon_{it}) \tag{5}$$


To compare both risk factors, equation (5) has been divided by $\mathrm{Var}(R_{it})$. Thus, the
contribution of each individual factor can be computed as the ratio of its variance over 
the total variance of the bank portfolio return. As shown in Table 4, the market risk is 
indisputably the most important determinant of bank returns. IRR is comparatively 
less relevant, long-term rates being the ones which show greater incidence. 
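
The variance decomposition in (5) can be sketched as below. The sketch assumes that the two-factor model (4) is fitted by ordinary least squares and that the market return has already been orthogonalised against the interest rate changes; the function and argument names are illustrative.

```python
import numpy as np

def risk_contributions(r, r_m_orth, d_i):
    """Shares of Var(r) attributed to each factor, following equation (5).

    r: bank portfolio returns.
    r_m_orth: market returns already orthogonalised against d_i, so that
        the cross term between the two factors vanishes.
    d_i: interest rate changes.
    Returns (interest rate share, market share) as fractions of Var(r).
    Assumption: model (4) is estimated by OLS in this sketch.
    """
    r = np.asarray(r, dtype=float)
    r_m_orth = np.asarray(r_m_orth, dtype=float)
    d_i = np.asarray(d_i, dtype=float)
    X = np.column_stack([np.ones_like(r), r_m_orth, d_i])
    coef = np.linalg.lstsq(X, r, rcond=None)[0]
    lam, theta = coef[1], coef[2]
    total = np.var(r)
    share_interest = theta ** 2 * np.var(d_i) / total
    share_market = lam ** 2 * np.var(r_m_orth) / total
    return share_interest, share_market
```

Multiplying the two shares by 100 gives figures comparable to the factor R² percentages of Table 4; the remainder up to 100% is the share of the residual variance.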


Table 4. Relative importance of risk factors


                          Interest rate changes

                          3 months                             10 years                             Spread
                          $\Delta I_t$   $R_{mt}$   Total      $\Delta I_t$   $R_{mt}$   Total      $\Delta I_t$   $R_{mt}$   Total
Portfolio L  $R^2$ (%)    0.85           53.84      54.69      2.81           51.77      54.58      1.22           53.47      54.69
Portfolio M  $R^2$ (%)    1.30           34.21      35.52      2.74           32.78      35.52      1.19           34.83      36.02
Portfolio S  $R^2$ (%)    1.24           15.19      16.42      5.59           12.40      17.99      1.08           15.35      16.43


This table shows the contribution of interest rate and market risks, measured through the factor
$R^2$ obtained from equation (5), in explaining the total variance of bank portfolio returns.

Turning to the mean equation of the GARCH-M model, the parameter $\gamma_i$ has
usually been interpreted as the compensation required to invest in risky assets by
risk-averse investors. Since volatility as measured in GARCH models is not a measure
of systematic risk, but total risk, $\gamma_i$ does not necessarily have to be positive because
increases of total risk do not always imply higher returns.³ For our case, the estimated
values for $\gamma_i$ differ in sign across bank portfolios (positive for portfolios L and M and
negative for portfolio S). This heterogeneity among banks may be basically derived
from differences in product and client specialisation, interest rate hedging strategies,
etc. The absence of a conclusive result concerning this parameter is in line with the
lack of consensus found in prior research. In this sense, whereas [12] and [4] detected
a positive relationship between risk and return ($\gamma_i > 0$), [5, 9, 13] suggested a negative
relationship ($\gamma_i < 0$). In turn, [2] and [16] found an insignificant $\gamma_i$.

With regard to the conditional variance equation, $\alpha_1$ and $\beta$ are positive and
significant in the majority of cases. In addition, the volatility persistence ($\alpha_1 + \beta$) is always
less than unity, consistent with the stationarity conditions of the model. This implies
that the traditional constant-variance capital asset pricing models are inappropriate 
for describing the distribution of bank stock returns in the Spanish case. 

The parameter $\delta_i$, which measures the effect of interest rate volatility on bank
portfolio return volatility, is negative and significant for portfolios L and M.⁴ A possible
explanation suggested by [5] is that, in response to an increase in interest rate
volatility, L and M banks seek shelter from IRR and are able to reduce their exposure
within one month, e.g., by holding derivatives and/or reducing the duration gap of
their assets and liabilities. Hence, this generates a lower bank stock volatility in the
following period. Moreover, a direct relationship seems to exist between the absolute
value of $\delta_i$, the bank size and the term to maturity of interest rates. Thus, analogously
to the previous evidence with interest rate changes, interest rate volatility has a greater 
negative effect on bank return volatility as the bank size increases. Further, interest 
rate volatility has a larger impact when long-term rates are considered. In sum, it 
can be concluded that the Spanish bank industry does show a significant interest rate 
exposure, especially with respect to long-term interest rates. 

In addition, the proposed GARCH model has been augmented with the purpose
of checking whether the introduction of the euro as the single currency within the
European Monetary Union from January 1, 1999 has significantly altered the degree
of IRR of Spanish banks.⁵ Thus, the following extended model has been estimated:


$$R_{it} = \omega_i + \lambda_i R_{mt} + \theta_i \Delta I_t + \eta_i D_t \Delta I_t + \gamma_i \log h_{it} + \epsilon_{it} \tag{6}$$
$$h_{it} = \alpha_0 + \alpha_1 \epsilon_{it-1}^2 + \beta h_{it-1} + \delta_i VCI_{t-1} \tag{7}$$
$$\epsilon_{it} \mid \Omega_{t-1} \sim N(0, h_{it}) \tag{8}$$


where $D_t = 1$ if $t <$ January 1999 and $D_t = 0$ if $t \geq$ January 1999. Its associated
coefficient, $\eta_i$, reflects the differential impact in terms of exposure to IRR during the
pre-euro period. The results are reported in Table 5.


³ [13] indicates several reasons for the relationship between risk and return being negative.
In the framework of the financial sector, [5] also suggests an explanation to get a negative
trade-off coefficient between risk and return.

⁴ Recall that this parameter does not appear in the model for portfolio S.

⁵ Since the GARCH model estimation requires a considerable number of observations, a
dummy variable procedure has been employed instead of estimating the model for each
subperiod.


Table 5. Maximum likelihood estimates of the GARCH-M extended model with dummy variable


              Portfolio L                        Portfolio M                        Portfolio S
              3 months   10 years   Spread       3 months   10 years   Spread       3 months   10 years   Spread
$\theta_i$    1.94       3.44       -1.88***     2.42       2.59       -0.73***     -1.69      -0.67      -0.17
$\eta_i$      -6.52**    -4.69**    1.52**       -4.57***   -6.53***   0.95***      0.13       -3.43***   -0.61**


This table shows the IRR estimated parameters in the GARCH-M model following (6)-(8).
***, ** and * denote statistical significance at the 1%, 5% and 10% levels, respectively.


The coefficient $\eta_i$ is negative and significant at the usual levels in most cases
with the long- and short-term interest rate changes, whereas the results are not totally 
conclusive with the spread series. This finding shows that the IRR is substantially 
higher during the pre-euro period, in line with prior evidence (see [15]) and indicating 
that interest rate sensitivity of bank stock returns has decreased considerably since 
the introduction of the euro. The declining bank IRR during the last decade can be 
basically attributed to the adoption of a more active role in asset-liability management 
by the banks in response to the increase of volatility in financial markets, which has 
led to more effective IRR management. 
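
The dummy-variable device used in the mean equation (6) can be sketched as follows: the regressor multiplying the differential coefficient is the interest rate change interacted with a pre-euro indicator. The function name and the treatment of January 1999 itself as a post-euro month are assumptions of this example.

```python
import numpy as np

def pre_euro_interaction(dates, d_i, cutoff="1999-01"):
    """Return the dummy D_t and the interaction D_t * dI_t for monthly data.

    dates: month labels such as "1998-12", convertible to datetime64[M].
    d_i: interest rate changes for the same months.
    D_t = 1 before the cutoff month (pre-euro) and 0 from the cutoff onwards;
    placing the cutoff month in the post-euro period is an assumption.
    """
    months = np.asarray(dates, dtype="datetime64[M]")
    dummy = (months < np.datetime64(cutoff, "M")).astype(float)
    return dummy, dummy * np.asarray(d_i, dtype=float)
```

With this regressor in the mean equation, the interest rate sensitivity is the sum of the base and differential coefficients before the euro and the base coefficient alone afterwards, so a negative differential coefficient points to stronger exposure in the pre-euro period.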


6 Conclusions 


This paper examines the interest rate exposure of the Spanish banking sector within 
the framework of the GARCH-M model. In particular, the analysis has been carried out on
bank stock portfolios constructed according to size criteria. Following the most recent 
strand of research, this study investigates the impact of both interest rate changes and 
interest rate volatility on the distribution of bank stock returns. 

The results confirm the common perception that interest rate risk is a significant 
factor to explain the variability in bank stock returns but, as expected, it plays a sec- 
ondary role in comparison with market risk. Consistent with previous work, bank 
stock portfolio returns are negatively correlated with changes in interest rates, the 
long-term rates being the ones which exert greater influence. This negative relationship
has been mostly attributed to the typical maturity mismatch between banks’
assets and liabilities. Another explanation is closely linked to the expansion phase 
of the Spanish economy since the mid-1990s. Specifically, bank profits did increase 
dramatically, reaching their greatest figures ever, with the subsequent positive effect 
on stock prices, in a context of historically low interest rates within the framework 
of the Spanish housing boom. Further, interest rate volatility is also found to be a 
significant determinant of bank portfolio return volatility, with a negative effect. 

Another major result refers to the direct relationship found between bank size and 
interest rate sensitivity. This size-based divergence could be the result of differences 
between large and small banks in terms of bank pricing policy, extent of use of 
derivative instruments or product and client specialisation. Thus, larger banks have 
a stock performance basically driven by market conditions, whereas smaller banks 
are influenced more heavily by idiosyncratic risk factors. Finally, a decline of bank 
interest rate sensitivity during recent years has been documented, which may be linked 
to the greater availability of systems and instruments to manage and hedge interest 
rate risk. 


Tracking error with minimum guarantee constraints


Diana Barro and Elio Canestrelli 


Abstract. In recent years the popularity of indexing has greatly increased in financial markets 
and many different families of products have been introduced. Often these products also have 
a minimum guarantee in the form of a minimum rate of return at specified dates or a minimum 
level of wealth at the end of the horizon. Periods of declining stock market returns together 
with low interest rate levels on Treasury bonds make it more difficult to meet these liabilities. 
We formulate a dynamic asset allocation problem which takes into account the conflicting 
objectives of a minimum guaranteed return and of an upside capture of the risky asset returns. To 
combine these goals we formulate a double tracking error problem using asymmetric tracking 
error measures in the multistage stochastic programming framework. 


Key words: minimum guarantee, benchmark, tracking error, dynamic asset allocation, sce- 
nario 


1 Introduction 


The simultaneous presence of a benchmark and a minimum guaranteed return char- 
acterises many structured financial products. The objective is to attract potential in- 
vestors who express an interest in high stock market returns but also are not risk- 
seeking enough to fully accept the volatility of this investment and require a cushion. 
This problem is of interest also for the asset allocation choices for pension funds 
both in the case of defined benefits (which can be linked to the return of the funds) 
and defined contribution schemes in order to be able to attract members to the fund. 
Moreover, many life insurance products include an option on a minimum guaranteed 
return and a minimum amount can be guaranteed by a fund manager for credibil- 
ity reasons. Thus the proper choice of an asset allocation model is of interest not 
only for investment funds or insurance companies that offer products with investment 
components, but also for the pension fund industry.

In the literature there are contributions which discuss the two components sep- 
arately, and there are contributions which discuss the tracking error problem when 
a Value at Risk (VaR), Conditional Value at Risk (CVaR) or Maximum Drawdown 
(MD) constraint is introduced mainly in a static framework, but very few contributions 
address the dynamic portfolio management problem when both a minimum guarantee
and a tracking error objective are present; see for example [14]. To jointly model
these goals we work in the stochastic programming framework as it has proved to 
be flexible enough to deal with many different issues which arise in the formulation 
and solution of these problems. We do not consider the point of view of an investor 
who wants to maximise the expected utility of his wealth along the planning horizon 
or at the end of the investment period. Instead we consider the point of view of a 
manager of a fund, thus representing a collection of investors, who is responsible for 
the management of a portfolio connected with financial products which offer not only 
a minimum guaranteed return but also an upside capture of the risky portfolio returns. 
His goals are thus conflicting since in order to maximise the upside capture he has 
to increase the total riskiness of the portfolio and this can result in a violation of the 
minimum return guarantee if the stock market experiences periods of declining re- 
turns or if the investment policy is not optimal. On the other hand a low risk profile on 
the investment choices can assure the achievement of the minimum return guarantee, 
if properly designed, but leaves no opportunity for upside capture. 


2 Minimum guaranteed return and constraints 
on the level of wealth 


The relevance of the introduction of minimum guaranteed return products has grown 
in recent years due to financial market instability and to the low level of interest rates 
on government (sovereign) and other bonds. This makes it more difficult to fix the 
level of the guarantee in order to attract potential investors. Moreover, this may create 
potential financial instability and defaults due to the high levels of guarantees fixed
in the past for contracts with long maturities, such as life insurance or pension fund
contracts. See, for example, [8, 20, 31].

A range of guarantee features can be devised such as rate-of-return guarantee, 
including the principal guarantee, i.e., with a zero rate of return, minimum benefit 
guarantee and real principal guarantee. Some of them are more interesting for par- 
ticipants in pension funds while others are more relevant for life insurance products 
or mutual funds. In the case of a minimum return guarantee, we ensure a deterministic
positive rate of return (given the admissibility constraints for the attainable rates of
return); with a minimum benefit guarantee, a minimum level of payments is guaranteed, at the
retirement date, for example. In the presence of a nominal guarantee, a fixed percentage
of the initial wealth is usually guaranteed for a specified date while real or flexible 
guarantees are usually connected to an inflation index or a capital market index. 

The guarantee constraints can be chosen with respect to the value of terminal 
wealth or as a sequence of (possibly increasing) guaranteed returns. This choice may 
be dictated by the conditions of the financial products linked to the fund. The design of the
guarantee is a crucial issue and has a substantial impact on the choice of management
strategies.
Not every value of minimum guarantee is reachable; no-arbitrage arguments can
be applied. The optimal design of a minimum guarantee has been considered and
discussed in the context of pension fund management in [14]. Muermann et al. [26]
analyse the willingness of participants in a defined contribution pension fund to pay
for a guarantee from the point of view of regret analysis.

Another issue which has to be tackled in the formulation is the fact that policies
which give a minimum guaranteed return usually also provide policyholders with a
certain amount of the return of the risky part of the portfolio invested in the equity
market. This reduces the possibility of implementing a portfolio allocation based on 
Treasury bonds since no upside potential would be captured. The main objective is 
thus a proper combination of two conflicting goals, namely a guaranteed return, i.e., 
a low profile of risk, and at least part of the higher returns which could be granted 
by the equity market at the cost of a high exposure to the risk of not meeting the 
minimum return requirement. 

The first possibility is to divide the investment decision into two steps. In the first 
the investor chooses the allocation strategy without taking care of the guarantee, while 
in the second step he applies a dynamic insurance strategy (see for example [15]). 

Consiglio et al. [9] discuss a problem of asset and liability management for UK 
insurance products with guarantees. These products offer the owners both a minimum 
guaranteed rate of return and the possibility to participate in the returns of the risky 
part of the portfolio invested in the equity market. The minimum guarantee is treated 
as a constraint and the fund manager maximises the certainty equivalent excess return 
on equity (CEexROE). This approach is flexible and allows one to deal also with the 
presence of bonuses and/or target terminal wealth. 

Different contributions in the literature have tackled the problem of optimal portfolio
choices in the presence of a minimum guarantee, both in continuous and in discrete
time, and also from the point of view of portfolio insurance strategies, both for a European
type guarantee and for an American type guarantee; see, for example, [10, 11].

We consider the problem of formulating and solving an optimal allocation problem 
including minimum guarantee requirements and participation in the returns generated 
from the risky portfolio. These goals can be achieved either by treating them as constraints
or by including them in the objective function. In the following we will analyse
in more detail the second case in the context of dynamic tracking error problems,
which in our opinion provide the more flexible framework.


3 Benchmark and tracking error issues 


The introduction of benchmarks and of indexed products has greatly increased since
the Capital Asset Pricing Model (see [23, 25, 28]) provided a theoretical basis for index
funds. The declaration of a benchmark is particularly relevant in the definition of the
risk profile of the fund and in the evaluation of the performance of fund managers.
The analysis of the success in replicating a benchmark is conducted through tracking 
error measures. 
Considering a given benchmark, different sources of tracking error can be analysed
and discussed; see, for example, [19]. The introduction of a liquidity component in
the management of the portfolio, the choice of a partial replication strategy, and
management expenses, among others, can lead to tracking errors in the replication
of the behaviour of the index designated as the benchmark. This issue is particularly
relevant in a pure passive strategy where the goal of the fund manager is to perfectly
mimic the result of the benchmark, while it is less crucial if we consider active asset
allocation strategies in which the objective is to create overperformance with respect
to the benchmark. For instance, the choice of asymmetric tracking error measures
allows us to optimise the portfolio composition in order to try to maximise the positive
deviations from the benchmark. For the use of asymmetric tracking error measures
in a static framework see, for example, [16, 22, 24, 27].

For a discussion on risk management in the presence of benchmarking, see Basak 
et al. [4]. Alexander and Baptista [1] analyse the effect of a drawdown constraint, 
introduced to control the shortfall with respect to a benchmark, on the optimality of 
the portfolios in a static framework. 

We are interested in considering dynamic tracking error problems with a stochastic 
benchmark. For a discussion on dynamic tracking error problems we refer to [2,5,7, 
13,17]. 


4 Formulation of the problem 


We consider the asset allocation problem for a fund manager who aims at maximis- 
ing the return on a risky portfolio while preserving a minimum guaranteed return. 
Maximising the upside capture increases the total risk of the portfolio. This can be 
balanced by the introduction of a second goal, i.e., the minimisation of the shortfall 
with respect to the minimum guarantee level. 

We model the first part of the objective function as the maximisation of the over- 
performance with respect to a given stochastic benchmark. The minimum guarantee 
itself can be modelled as a, possibly dynamic, benchmark. Thus the problem can be 
formalized as a double tracking error problem where we are interested in maximising 
the positive deviations from the risky benchmark while minimising the downside dis- 
tance from the minimum guarantee. The choice of asymmetric tracking error measures 
allows us to properly combine the two goals. 

To describe the uncertainty, in the context of a multiperiod stochastic programming
problem, we use a scenario tree. A set of scenarios is a collection of paths from $t = 0$
to $T$, with probabilities $\pi_{k_t}$ associated to each node $k_t$ in the path. According to
the information structure assumed, this collection can be represented as a scenario
tree where the current state corresponds to the root of the tree and each scenario is
represented as a path from the origin to a leaf of the tree.
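
The scenario tree described above can be illustrated with a minimal sketch. The `Node` class, the binary branching and the branch probability `p_up` are hypothetical choices made only for this example; the paper does not prescribe a particular tree shape or data structure.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the scenario tree (hypothetical structure for illustration)."""
    time: int
    prob: float  # unconditional probability of reaching this node
    children: list = field(default_factory=list)


def build_binary_tree(horizon, p_up=0.5):
    """Build a tree with two branches per node and `horizon` stages below the root."""
    root = Node(time=0, prob=1.0)
    level = [root]
    for t in range(1, horizon + 1):
        next_level = []
        for parent in level:
            for branch_prob in (p_up, 1.0 - p_up):
                child = Node(time=t, prob=parent.prob * branch_prob)
                parent.children.append(child)
                next_level.append(child)
        level = next_level
    return root


def scenarios(node, path=()):
    """Yield every path from the root to a leaf; each path is one scenario."""
    path = path + (node,)
    if not node.children:
        yield path
    else:
        for child in node.children:
            yield from scenarios(child, path)


tree = build_binary_tree(horizon=3)
paths = list(scenarios(tree))
leaf_probs = [path[-1].prob for path in paths]
```

With three stages and two branches per node the sketch produces eight scenarios, and the probabilities of the leaves sum to one.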

If we fix it as a minimal guaranteed return, without any requirement on the upside 
capture we obtain a problem which fits the portfolio insurance framework, see, for 
example, [3, 6, 18, 21, 29]. For portfolio insurance strategies there are strict restrictions
on the choice of the benchmark, which cannot exceed the return on the risk-free
security for no-arbitrage conditions.

Let $x_{k_t}$ be the value of the risky benchmark at time $t$ in node $k_t$; $z_t$ is the value of the
lower benchmark, the minimum guarantee, which can be assumed to be constant or
have deterministic dynamics, thus it does not depend on the node $k_t$. We denote with
$y_{k_t}$ the value of the managed portfolio at time $t$ in node $k_t$. Moreover let $\phi_{k_t}(y_{k_t}, x_{k_t})$ be
a proper tracking error measure which accounts for the distance between the managed
portfolio and the risky benchmark, and $\psi_{k_t}(y_{k_t}, z_t)$ a distance measure between the
risky portfolio and the minimum guarantee benchmark. The objective function can
be written as


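$$
\max_{y_{k_t}} \sum_{t=0}^{T} \left[ \alpha_t \sum_{k_t=K_{t-1}+1}^{K_t} \phi_{k_t}(y_{k_t}, x_{k_t}) - \beta_t \sum_{k_t=K_{t-1}+1}^{K_t} \psi_{k_t}(y_{k_t}, z_t) \right] \tag{1}
$$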


where $\alpha_t$ and $\beta_t$ represent sequences of positive weights which can account both
for the relative importance of the two goals in the objective function and for a time
preference of the manager. For example, if we consider a pension fund portfolio
management problem we can assume that the upside capture goal is preferable at
the early stage of the investment horizon while a more conservative strategy can be
adopted at the end of the investment period. A proper choice of $\phi_t$ and $\psi_t$ allows us
to define different tracking error problems.

The tracking error measures are indexed along the planning horizon in such a way
that we can monitor the behaviour of the portfolio at each trading date $t$. Other
formulations are possible. For example, we can assume that the objective of a minimum
guarantee is relevant only at the terminal stage where we require a minimum level of
wealth $z_T$:


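$$
\max_{y_{k_t}} \sum_{t=0}^{T} \left[ \alpha_t \sum_{k_t=K_{t-1}+1}^{K_t} \phi_{k_t}(y_{k_t}, x_{k_t}) \right] - \beta_T \sum_{k_T=K_{T-1}+1}^{K_T} \psi_{k_T}(y_{k_T}, z_T). \tag{2}
$$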


The proposed model can be considered a generalisation of the tracking error model 
of Dembo and Rosen [12], who consider as an objective function a weighted average 
of positive and negative deviations from a benchmark. In our model we consider two 
different benchmarks and a dynamic tracking problem. 

The model can be generalised in order to take into account a monitoring of the 
shortfall more frequent than the trading dates, see Dempster et al. [14]. 

We consider a decision maker who has to compose and manage his portfolio
using $n = n_1 + n_2$ risky assets and a liquidity component. In the following $q_{ik_t}$,
$i = 1, \dots, n_1$, denotes the position in the $i$th stock and $b_{jk_t}$, $j = 1, \dots, n_2$, denotes
the position in the $j$th bond, while $c_{k_t}$ denotes the amount of cash.

We denote with $r_{k_t} = (r_{1k_t}, \dots, r_{nk_t})$ the vector of returns of the risky assets for
the period $[t-1, t]$ in node $k_t$ and with $r_{ck_t}$ the return on the liquidity component
in node $k_t$. In order to account for transaction costs and the liquidity component in the
portfolio we introduce two vectors of variables $a_{k_t} = (a_{1k_t}, \dots, a_{nk_t})$ and
$v_{k_t} = (v_{1k_t}, \dots, v_{nk_t})$ denoting the value of each asset purchased and sold at time $t$ in node
$k_t$, while we denote with $\kappa^+$ and $\kappa^-$ the proportional transaction costs for purchases
and sales.

Different choices of tracking error measures are possible and different trade-offs 
between the goals on the minimum guarantee side and on the enhanced tracking error 
side, for the risky benchmark, are possible, too. In this contribution we do not tackle 
the issue of comparing different choices for tracking error measures and trade-offs 
in the goals with respect to the risk attitude of the investor. Among different possible 
models, we propose the absolute downside deviation as a measure of tracking error 
between the managed portfolio and the minimum guarantee benchmark, while we 
consider only the upside deviations between the portfolio and the risky benchmark 


$$
\phi_{k_t}(y_{k_t}, x_{k_t}) = [y_{k_t} - x_{k_t}]^+ = \theta^+_{k_t}, \tag{3}
$$
$$
\psi_{k_t}(y_{k_t}, z_t) = [y_{k_t} - z_t]^- = \theta^-_{k_t}, \tag{4}
$$

where $[y_{k_t} - x_{k_t}]^+ = \max[y_{k_t} - x_{k_t}, 0]$ and $[y_{k_t} - z_t]^- = -\min[y_{k_t} - z_t, 0]$. The
minimum guarantee can be assumed constant over the entire planning horizon or it can
follow deterministic dynamics, i.e., it is not scenario dependent. Following [14] we
assume that there is an annual guaranteed rate of return denoted with $\rho$. If the initial
wealth is $W_0 = \sum_i x_{i0}$, then the value of the guarantee at the end of the planning
horizon is $W_T = W_0(1 + \rho)^T$. At each intermediate date the value of the guarantee
is given by $z_t = e^{-\delta_{(t,T-t)}(T-t)} W_0(1 + \rho)^T$, where $e^{-\delta_{(t,T-t)}(T-t)}$ is a discounting factor,
i.e., the price at time $t$ of a zero-coupon bond (zcb) which pays 1 at terminal time $T$.
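
The guarantee level at intermediate dates can be computed with a short sketch. It assumes that the discounting factor is the price of a zero-coupon bond with a continuously compounded yield, so that it equals exp(-yield * (T - t)); the function name, the flat 3% yield curve and the numerical inputs are hypothetical and serve only as an illustration.

```python
import math


def guarantee_path(w0, rho, horizon, yields):
    """Return the guarantee levels z_0, ..., z_T.

    Assumption: yields[t] is the continuously compounded zero-coupon yield
    observed at time t for the residual maturity T - t, so the discount
    factor is exp(-yields[t] * (T - t)).
    """
    terminal = w0 * (1.0 + rho) ** horizon  # guaranteed terminal wealth
    return [
        math.exp(-yields[t] * (horizon - t)) * terminal
        for t in range(horizon + 1)
    ]


# Hypothetical inputs: initial wealth 100, 2% guaranteed rate, 5 years, flat 3% yield
z = guarantee_path(w0=100.0, rho=0.02, horizon=5, yields=[0.03] * 6)
```

At the terminal date the discount factor equals one, so the last element of `z` coincides with the guaranteed terminal wealth; with positive yields the earlier levels are lower.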

The objective function becomes a weighted trade-off between negative deviations 
from the minimum guarantee and positive deviations from the risky benchmark. Given 
the choice for the minimum guarantee, the objective function penalises the negative 
deviations from the risky benchmark only when these deviations are such that the 
portfolio values are below the minimum guarantee and penalises them for the amounts 
which are below the minimum guarantee. Thus, the choice of the relative weights for 
the two goals is crucial in the determination of the level of risk of the portfolio strategy. 
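
The trade-off can be made concrete by evaluating the objective on a toy tree. The sketch below assumes the upside and downside deviations are simply summed over the nodes of each stage and weighted by stage-dependent weights; the node values, the guarantee levels and the weights are hypothetical.

```python
def upside(y, x):
    """Positive deviation of the portfolio value y from the risky benchmark x."""
    return max(y - x, 0.0)


def downside(y, z):
    """Shortfall of the portfolio value y below the guarantee level z."""
    return -min(y - z, 0.0)


def objective(nodes_by_time, guarantee, alpha, beta):
    """Weighted upside capture minus weighted shortfall.

    nodes_by_time[t] is a list of (portfolio value, benchmark value) pairs,
    one pair per node at stage t; guarantee[t] is the guarantee level at t.
    """
    total = 0.0
    for t, nodes in enumerate(nodes_by_time):
        up = sum(upside(y, x) for y, x in nodes)
        down = sum(downside(y, guarantee[t]) for y, _ in nodes)
        total += alpha[t] * up - beta[t] * down
    return total


# Hypothetical two-stage example: one root node and two successor nodes
nodes_by_time = [[(100.0, 100.0)], [(108.0, 105.0), (95.0, 97.0)]]
value = objective(nodes_by_time, guarantee=[98.0, 99.0],
                  alpha=[1.0, 1.0], beta=[2.0, 2.0])
```

In the second node of stage 1 the portfolio is below the benchmark, but only the amount below the guarantee (99 - 95 = 4) is penalised, which mirrors the behaviour described in the paragraph above.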

The obtained dynamic tracking error problem in its arborescent form is 


$$
\max_{q_{k_t},\, b_{k_t},\, c_{k_t}} \sum_{t=0}^{T} \left[ \alpha_t \sum_{k_t=K_{t-1}+1}^{K_t} \theta^+_{k_t} - \beta_t \sum_{k_t=K_{t-1}+1}^{K_t} \theta^-_{k_t} \right] \tag{5}
$$

$$
y_{k_t} = c_{k_t} + \sum_{i=1}^{n_1} q_{ik_t} + \sum_{j=1}^{n_2} b_{jk_t} \tag{8}
$$

$$
a_{ik_t} \ge 0, \qquad v_{ik_t} \ge 0, \qquad i = 1, \dots, n_1 \tag{12}
$$
where equation (8) represents the portfolio composition in node $k_t$; equations (9)-(11)
describe the dynamics of the amounts of stocks, bonds and cash in the portfolio
moving from the ancestor node $f(k_t)$, at time $t-1$, to the descendent nodes $k_t$, at time
$t$, with $K_0 = 0$. In equation (11), with $g_{k_t}$ we denote the inflows from the bonds in the
portfolio. Equation (16) represents the complementarity conditions which prevent
positive and negative deviations from being different from zero at the same time.
Equations (20)-(22) give the initial endowments for stocks, bonds and cash.

We need to specify the value of the benchmark and the value of the minimum
guarantee at each time and for each node. The stochastic benchmark $x_{k_t}$ and the prices
of the risky assets in the portfolio must be simulated according to given stochastic
processes in order to build the corresponding scenario trees. Other dynamics for the
minimum guaranteed level of wealth can be designed. In particular, we can discuss
a time-varying rate of return $\rho_t$ along the planning horizon, or we can include the
accrued bonuses as in [8].

A second approach to tackle the problem of the minimum return guarantee is
to introduce probabilistic constraints in the dynamic optimisation problem. Denoting
with $\theta$ the desired confidence level, we can formulate the shortfall constraints
both on the level of wealth at an intermediate time $t$ and on the terminal wealth as
follows:

$$
\Pr(W_t < z_t) \le 1 - \theta, \qquad \Pr(W_T < z_T) \le 1 - \theta,
$$

where $W_t$ is the random variable representing the level of wealth. Under the
assumption of a discrete and finite number of realisations we can compute the shortfall
probability using the values of the wealth in each node, $W_{k_t} = \sum_i x_{ik_t}$. This gives
rise to a chance constrained stochastic optimisation problem which can be extremely
difficult to solve due to non-convexities which may arise, see [14].
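
With a finite number of nodes the shortfall probability is just the total probability of the nodes whose wealth lies below the guarantee. The following sketch checks a shortfall constraint for fixed node values; the function names, the node wealth levels and the probabilities are hypothetical, and the sketch only evaluates the constraint rather than solving the chance constrained problem.

```python
def shortfall_probability(wealth, probs, guarantee):
    """Total probability of the nodes in which wealth falls below the guarantee."""
    return sum(p for w, p in zip(wealth, probs) if w < guarantee)


def satisfies_shortfall_constraint(wealth, probs, guarantee, confidence):
    """Check Pr(W < z) <= 1 - confidence for the nodes of one stage."""
    return shortfall_probability(wealth, probs, guarantee) <= 1.0 - confidence


# Hypothetical stage with four nodes
wealth = [120.0, 105.0, 98.0, 90.0]
probs = [0.4, 0.3, 0.2, 0.1]

p_short = shortfall_probability(wealth, probs, guarantee=100.0)
strict = satisfies_shortfall_constraint(wealth, probs, 100.0, confidence=0.95)
loose = satisfies_shortfall_constraint(wealth, probs, 100.0, confidence=0.60)
```

Because the indicator of a shortfall in each node depends discontinuously on the decision variables, embedding this check in the optimisation is what produces the non-convexities mentioned above.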


5 Conclusions 


We discuss the issue of including in the formulation of a dynamic portfolio optimisa- 
tion problem both a minimum return guarantee and the maximisation of the potential 
returns from a risky portfolio. To combine these two conflicting goals we formulate 
them in the framework of a double dynamic tracking error problem using asymmetric 
tracking measures. 
Energy markets: crucial relationship between prices 


Cristina Bencivenga, Giulia Sargenti, and Rita L. D’Ecclesia 


Abstract. This study investigates the relationship between crude oil, natural gas and electricity 
prices. A possible integration may exist and it can be measured using a cointegration approach. 
The relationship between energy commodities may have several implications for the pricing of 
derivative products and for risk management purposes. Using daily price data for Brent crude 
oil, NBP UK natural gas and EEX electricity we analyse the short- and long-run relationship 
between these markets. An unconditional correlation analysis is performed to study the short- 
term relationship, which appears to be very unstable and dominated by noise. A long-run 
relationship is analysed using the Engle-Granger cointegration framework. Our results indicate 
that gas, oil and electricity markets are integrated. The framework used allows us to identify a 
short-run relationship. 


Key words: energy commodities, correlation, cointegration, market integration 


1 Introduction 


Energy commodities have been a leading actor in the economic and financial scene in 
the last decade. The deregulation of electricity and gas markets in western countries 
caused a serious change in the dynamic of electricity and gas prices and necessitated 
the adoption of adequate risk management strategies. The crude oil market has also
been experiencing serious changes over the last decade caused by economic
and political factors. The deregulation of gas and electricity markets should cause, 
among other things, more efficient price formation of these commodities. However 
their dependence on oil prices is still crucial. An analysis of how these commodities 
are related to each other represents a milestone in the definition of risk measurement 
and management tools. 

For years natural gas and refined petroleum products have been used as close 
substitutes in power generation and industry. As a consequence, movements of natural 
gas prices have generally tracked those of crude oil. This led academics and
practitioners to use a simple rule of thumb to relate natural gas prices to crude oil
prices, according to which a simple deterministic function may be able to explain the
relationships between them (see, e.g., [7]). Recently the number of facilities able to
switch between natural gas and residual fuel oil has declined, so gas prices seem to
move more independently from oil prices. However, to a certain extent, oil prices are
expected to remain the main drivers of energy prices through inter-fuel competition 
and price indexation clauses in some long-term gas contracts. 

Finally, the high price volatility in the energy commodity markets boosted the 
development of energy derivative instruments largely used for risk management. In 
particular, spread options have been largely used, given that the most useful and
important structure in the world of energy is represented by the spread.¹ The joint
behaviour of commodity prices such as gas, oil and electricity is crucial for a
proper valuation of spread contracts. This requires a real understanding of the nature
of volatility and correlation in energy markets.

The aim of this paper is twofold. First, to investigate the short-run relationship 
between oil, natural gas and electricity in the European energy markets. Second, to 
identify possible long-run equilibrium relationships between these commodities. In 
particular we test for shared price trends, or common trends, in order to detect if 
natural gas and electricity are driven by a unique source of randomness, crude oil. In 
a financial context the existence of cointegrating relationships implies no arbitrage 
opportunity between these markets as well as no leading market in the price discovery 
process. This is going to be a key feature for the definition of hedging strategies also 
for energy markets, given the recent deregulation process of the gas and the electricity 
market in Europe. 

The paper is organised as follows. Section 2 provides an overview of the rele- 
vant literature on this topic. Section 3 describes the data set given by daily prices of 
electricity, oil and natural gas for the European market over the period 2001-2007 and 
examines the annualised quarterly volatilities of each time series. In Sections 4 and
5 state-of-the-art methodologies, namely a rolling correlation and a cointegration
approach, are used to analyse the short- and long-run relationships. Section 6
draws some preliminary conclusions.


2 Relevant literature 


Economic theory suggests the existence of a relationship between natural gas and 
oil prices. Oil and natural gas are competitive substitutes and complements in
electricity generation and in industrial production. Due to the asymmetric relationship
in the relative size of each market, past changes in the price of oil caused changes in 
the natural gas market, but the converse did not hold [17]. 

The relationship between natural gas and crude oil has been largely investigated. 
In [14] UK gas and Brent oil prices over the period 1996-2003 have been analysed. 
In [3] the degree of market integration both among and between the crude oil, coal, 
and natural gas markets in the US has been investigated. A longer time period, 1989-2005,
is used in [17], where a cointegration relationship between oil and natural gas
prices has been found despite periods where they may have appeared to decouple.
A cointegration relationship between the prices of West Texas Intermediate (WTI)
crude oil and Henry Hub (HH) natural gas has been examined in [4] and [9].

¹ Spreads are price differentials between two commodities and are largely used to describe
power plants, refineries, storage facilities and transmission lines. For an extensive description
of energy spread options, see [6].

Analysis of the relationship between electricity and fossil fuel prices has only 
been performed at regional levels and on linked sets of data given the recent in- 
troduction of spot electricity markets. Serletis and Herbert [15] used the dynamics 
of North America natural gas, fuel oil and power prices from 1996 to 1997 to find 
that the HH and Transco Zone 6 natural gas prices and the fuel oil price are cointegrated,
whereas the power price series appears to be stationary. In [8] the existence
of a medium- and long-term correlation between electricity and fuel oil in Europe 
is analysed. [2] investigates the dynamic of gas, oil and electricity during an interim 
period 1995-1998: deregulation of the UK gas market (1995) and the opening up of 
the Interconnector (1998). Cointegration between natural gas, crude oil and electricity
prices is found and a leading role of crude oil is also identified. More recently,
in [13], interrelationships among electricity prices from two diverse markets,
Pennsylvania, New Jersey, Maryland Interconnection (PJM) and Mid-Columbia (Mid-C),
and four major fuel source prices, natural gas, crude oil, coal and uranium, in the
period 2001-2008, are examined using a multivariate time series framework.

To the best of our knowledge the level of integration between the gas, oil and 
electricity markets in the European market has not been investigated. The purpose of 
this study is mainly to perform such analysis in order to verify if an integrated energy 
market can be detected. 


3 The data set 


Time series for the daily prices of ICE Brent crude oil,² natural gas at the National
Balancing Point (NBP) UK³ and European Energy Exchange (EEX) electricity⁴ are
used for the period September 2001 to December 2007.

Oil prices are expressed in US$/barrel per day (bd), gas in UK p/therm and electricity
prices in €/Megawatt hour (MWh). We convert all prices into €/MWh using
the conversion factors for energy content provided by the Energy Information
Administration (EIA).⁵ The dynamics of the energy prices are shown in Figure 1.

Following the standard literature we perform a finer analysis of the volatility of
each price series by estimating the annualised quarterly volatilities $\sigma_i = \sigma_{i,N}\sqrt{250}$,
$i = 1, \dots, 25$, $N = 60$, where $\sigma_{i,N}$ is the volatility of quarter $i$ estimated from $N$ daily
observations.

² Brent blend is the reference crude oil for the North Sea and is one of the three major
benchmarks in the international oil market [7].

³ The NBP is the most liquid gas trading point in Europe. The NBP price is the reference
for many forward transactions and for the International Petroleum Exchange (IPE) Future
contracts [7].

⁴ EEX is one of the leading energy exchanges in central Europe [7]. For the purpose of our
analysis peak load prices have been used.

⁵ According to EIA conversion factors, 1 barrel of crude oil is equal to 1.58 MWh.

Fig. 1. Crude oil, natural gas and electricity prices, 2001-2007

The oil price volatility swings between 21% and 53%, confirming the non-stationarity
of the data. The same non-stationarity characterises the data of natural gas, fluctuating
between 65% and 330%. Electricity prices, as expected, were far more volatile than
oil and gas prices,⁶ with a range of quarterly volatility which swings between around
277% and 868%.
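
The annualisation step can be sketched as follows. The sketch assumes that volatility is measured as the sample standard deviation of daily log returns over non-overlapping windows of N = 60 observations, which is then scaled by the square root of 250; the function name and the synthetic price path are hypothetical.

```python
import math
import statistics


def annualised_window_volatilities(prices, window=60, days_per_year=250):
    """Annualised volatility of daily log returns on non-overlapping windows."""
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    vols = []
    for start in range(0, len(log_returns) - window + 1, window):
        daily_vol = statistics.stdev(log_returns[start:start + window])
        vols.append(daily_vol * math.sqrt(days_per_year))
    return vols


# Synthetic path: log returns alternate between +1% and -1% for 120 days
prices = [100.0]
for day in range(120):
    step = 0.01 if day % 2 == 0 else -0.01
    prices.append(prices[-1] * math.exp(step))

vols = annualised_window_volatilities(prices)
```

On this synthetic path both windows have the same daily volatility of roughly 1%, so both annualised values are close to 16%.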

A preliminary analysis is performed on the stationarity of the time
series. In line with most of the recent literature we transform the original series into
logs. First we test the order of integration of each time series using the Augmented
Dickey-Fuller (ADF) type regression:

$$
\Delta y_t = \alpha_0 + \alpha_1 t + \gamma y_{t-1} + \sum_{j=1}^{k} \beta_j \Delta y_{t-j} + \epsilon_t \tag{1}
$$

where $\Delta y_t = y_t - y_{t-1}$ and the lag length $k$ is selected automatically based on the Schwarz
information criterion (SIC). The results of the unit root test are reported in Table 1.

⁶ Seasonality and mean reversion are common features in commodity price dynamics; in
addition a jump component has to be included when describing electricity prices.
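
To make the test statistic concrete, the sketch below estimates the simplest special case of regression (1), without constant, trend or lagged differences, and returns the t-statistic of the coefficient on the lagged level. This is a deliberately reduced illustration and not the SIC-based procedure used in the paper; the function name and the sample series are hypothetical. The statistic has to be compared with Dickey-Fuller critical values (for example -1.94 at the 5% level without exogenous variables, as quoted in the note to Table 1), not with the usual Student t quantiles.

```python
import math


def dickey_fuller_t(y):
    """Estimate gamma in  dy_t = gamma * y_{t-1} + e_t  by least squares.

    Returns (gamma, t_statistic). Simplifying assumptions: no constant,
    no time trend and no lagged differences (k = 0).
    """
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y, y[1:])]
    sxx = sum(v * v for v in lagged)
    gamma = sum(v * d for v, d in zip(lagged, diffs)) / sxx
    residuals = [d - gamma * v for v, d in zip(lagged, diffs)]
    dof = len(diffs) - 1  # one estimated parameter
    s2 = sum(e * e for e in residuals) / dof
    return gamma, gamma / math.sqrt(s2 / sxx)


# Hypothetical strongly mean-reverting series
series = [1.0, -0.5, 0.4, -0.3, 0.35, -0.2, 0.15, -0.25, 0.1, -0.05]
gamma_hat, t_stat = dickey_fuller_t(series)
```

For this series the estimated coefficient is clearly negative and the statistic lies far below -1.94, so the unit root null would be rejected in this toy case.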


Table 1. Unit root test results for the logged price series


Series   $t_\gamma$    $\tau_0$       $\tau_1$        $\tau_d$     Decision
Oil       1.17 (1)     -0.70 (1)      -2.73 (1)       -43.2 (1)    I(1)
Gas      -0.30 (6)     -3.37* (6)     -5.58** (2)     -21.1 (1)    I(1)
Elect    -0.06 (14)    -3.41 (14)     -4.36** (14)    -18.4 (1)    I(1)


The critical values at the 5% significance level are -1.94 for ADF without exogenous variables, -2.86 for ADF
with a constant and -3.41 for ADF with a constant and trend. (*) denotes acceptance of the
null at 1%, (**) denotes rejection of the null at the conventional test sizes. The SIC-based
optimum lag lengths are in parentheses.


We run the test without any exogenous variable, with a constant and with a constant plus
a linear time trend as exogenous variables in equation (1). The reported t-statistics are
$t_\gamma$, $\tau_0$ and $\tau_1$, respectively. $\tau_d$ is the t-statistic for the ADF tests on first-differenced data.
$t_\gamma$ is greater than the critical values but we reject the unit root hypothesis in first differences,
hence we conclude that the variables are first-difference stationary (i.e., all the series
are $I(1)$).


4 The short-run relationship 


Alexander [1] presents the applications of correlation analysis to the crude oil and 
natural gas markets. Correlation measures co-movements of prices or returns and can 
be considered a short-term measure. It is essentially a static measure, so it cannot 
reveal any dynamic causal relationship. In addition, estimated correlations can be 
significantly biased or nonsense if the underlying variables are polynomials of time 
or when the two variables are non-stationary [18]. 

To analyse a possible short-run relationship among the variables, we estimate a
rolling correlation over $\tau_j = 100$ days⁷ according to:

$$
\rho_s[x, y] = \frac{\frac{1}{\tau_j} \sum_{i=s}^{s+\tau_j} (x_i - \bar{x})(y_i - \bar{y})}{\sigma_x \sigma_y}, \qquad s = 1, \dots, T - \tau_j, \tag{2}
$$

where $T = 1580$ (the entire period 2001-2007), and $\sigma_x$ and $\sigma_y$ are the standard
deviations of $x$ and $y$, estimated on the corresponding time window.
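
A rolling correlation of this kind can be sketched in a few lines. The sketch assumes that the means and standard deviations are recomputed on each window and uses population (divide by window length) moments; the function names and the short toy series are hypothetical, and the windows here contain exactly `window` observations.

```python
import math


def correlation(xs, ys):
    """Pearson correlation of two equally long samples (population moments)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


def rolling_correlation(xs, ys, window):
    """Correlation on every window of `window` consecutive observations."""
    return [
        correlation(xs[s:s + window], ys[s:s + window])
        for s in range(len(xs) - window + 1)
    ]


# Toy series: y moves with x at first and against it afterwards
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 6.0, 5.0, 4.0, 3.0]
rho = rolling_correlation(x, y, window=3)
```

The toy output moves from +1 to -1 across the sample, a small-scale version of the instability of the rolling estimates discussed in this section.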

Correlation changes over time, as expected, given the non-stationarity of the underlying
processes. Volatilities of commodity prices are time dependent and so are the
covariance and the unconditional correlation. This means that we can only attempt
to catch seasonal changes in correlations when interpreting the rolling correlation
coefficient. The unconditional correlation coefficients,⁸ $\rho_T$, together with the main
statistical features of the rolling correlations, $\rho_s$, $s = 1, \dots, T$, between the energy
price series are reported in Table 2. It is interesting to notice that the rolling correlations
between gas and oil show some counterintuitive behaviour.

⁷ This window period is suggested in [6]. We also perform the analysis with larger windows
($\tau_j$ = 100, 150, 200 days), getting similar results.


Table 2. Unconditional correlation and rolling correlations between log prices


Matrices     $\rho_T$   $E(\rho_s)$   $\sigma(\rho_s)$   $\max(\rho_s)$   $\min(\rho_s)$
Oil/Elect    0.537      0.0744        0.260              0.696            -0.567
Gas/Elect    0.515      0.119         0.227              0.657            -0.280
Oil/Gas      0.590      -0.027        0.426              0.825            -0.827


These results do not provide useful insights into the real nature of the relationship 
between the main commodities of the energy markets. 


5 The long-run relationship 


Table 1 confirms a stochastic trend for all the price series; a possible cointegration 
relationship among the energy commodity prices may therefore be captured (i.e., the 
presence of a shared stochastic trend or common trend). Two non-stationary series are 
cointegrated if a linear combination of them is stationary. The vector which realises 
such a linear combination is called the cointegrating vector. 

We examine the number of cointegrating vectors by using the Johansen method 
(see [10] and [11]). For this purpose we estimate a vector error correction model 
(VECM) based on the so-called reduced rank regression method (see [12]). Assume 
that the $n$-vector of non-stationary $I(1)$ variables $Y_t$ follows a vector autoregressive
(VAR) process of order $p$,


$$Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + \cdots + A_p Y_{t-p} + \varepsilon_t \tag{3}$$


with $\varepsilon_t$ as the corresponding $n$-dimensional white noise, and $A_i$, $i = 1, \ldots, p$,
$n \times n$ matrices of coefficients.⁹ Equation (3) is equivalently written in a VECM framework,


$$\Delta Y_t = D_1 \Delta Y_{t-1} + D_2 \Delta Y_{t-2} + \cdots + D_{p-1} \Delta Y_{t-p+1} + D Y_{t-1} + \varepsilon_t \tag{4}$$


where $D_i = -(A_{i+1} + \cdots + A_p)$, $i = 1, 2, \ldots, p-1$, and $D = (A_1 + \cdots + A_p - I_n)$.
Granger's representation theorem [5] asserts that if $D$ has reduced rank $r \in (0, n)$,
then $n \times r$ matrices $\Gamma$ and $B$ exist, each with rank $r$, such that $D = -\Gamma B'$ and $B'Y_t$ is
$I(0)$. Here $r$ is the number of cointegrating relations and the coefficients of the cointegrating
vectors are reported in the columns of $B$.
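
The mapping from the VAR matrices of equation (3) to the VECM matrices of equation (4) is mechanical, and a small numerical check may help. The sketch below is illustrative only; the function name `var_to_vecm` and the list-based interface are assumptions, not part of the paper.

```python
import numpy as np

def var_to_vecm(A):
    """Map VAR matrices [A_1, ..., A_p] to ([D_1, ..., D_{p-1}], D) as in equation (4)."""
    A = [np.asarray(a, dtype=float) for a in A]
    n = A[0].shape[0]
    p = len(A)
    # D_i = -(A_{i+1} + ... + A_p); with 0-based lists this is -sum(A[i+1:]) for i = 0..p-2
    D_short = [-sum(A[i + 1:]) for i in range(p - 1)]
    # D = A_1 + ... + A_p - I_n; its rank gives the number of cointegrating relations
    D = sum(A) - np.eye(n)
    return D_short, D
```

Calling `np.linalg.matrix_rank(D)` on the second output gives the rank $r$ that the Johansen procedure tests for, although in practice $D$ is estimated with noise and the rank has to be inferred statistically.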
The cointegration results for the log prices are shown in Table 3. 

8 The unconditional correlation for the entire period is given by $\rho_T = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}$.
9 In the following, for the VAR(p) model we exclude the presence of exogenous variables.


Table 3. Cointegration rank test (trace and maximum eigenvalue)


Nr. of coint. vec.   Eigenvalue   $\lambda_{\mathrm{trace}}$   $\lambda^{0.05}_{\mathrm{trace}}$   $\lambda_{\max}$   $\lambda^{0.05}_{\max}$
$r = 0$              0.121        2312      29.79     240.3     21.13
$r \le 1$            0.020        32.91     15.49     32.24     14.26
$r \le 2$            0.000        0.672     3.841     0.672     3.841


The tests reject the null hypotheses of no cointegration ($r = 0$) and of '$r$ at most 1' in
favour of '$r$ at most 2' at the 5% significance level. This provides evidence of the
existence of two cointegrating relationships among the three commodity price series.
In a VECM framework, the presence of two cointegrating vectors, $r = 2$, on a set of
$n = 3$ variables allows the estimation of $n - r = 1$ common (stochastic) trend [16].
The common trend may be interpreted as a source of randomness which affects the 
dynamics of the commodity prices. In this case we may assume oil prices represent 
the leading risk factor in the energy market as a whole. 

To better analyse the dynamics of the markets we use the Engle-Granger [5] two- 
step methodology. This method consists in estimating each cointegrating relationship 
individually using ordinary least squares (OLS) and then including the errors from 
those cointegrating equations in short-run dynamic adjustment equations which allow 
the explanation of adjustment to the long-run equilibrium. The first step is to estimate 
the so-called cointegrating regression 


$$y_{1,t} = \alpha + \beta y_{2,t} + z_t \tag{5}$$


where $y_{1,t}$ and $y_{2,t}$ are two price series, both integrated of order one, and $z_t$ denotes
the OLS regression residuals. We perform the test twice for each pair of time series,
using each series in turn as the dependent variable. The results are reported in Table 4. The
null hypothesis of no cointegration is rejected at the 8% significance level for the
regression oil vs electricity, and at the 1% level in all the other cases. The coefficients $\beta$
in equation (5), which represent the factors of proportionality for the common trend,
are estimated by OLS.

According to the Granger representation theorem, if two series cointegrate, the 
short-run dynamics can be described by the ECM. The basic ECM proposed in [5] 
can be written as follows: 


$$\Delta y_{1,t} = \phi \Delta y_{2,t} + \theta (y_{1,t-1} - \alpha - \beta y_{2,t-1}) + \varepsilon_t \tag{6}$$


where $(y_{1,t-1} - \alpha - \beta y_{2,t-1})$ represents the error correction term $z_{t-1}$ of equation (5),
$\phi$ measures the contemporaneous price response,¹⁰ $\theta$ represents the speed of the
adjustment towards the long-term cointegrating relationship, and $\varepsilon_t \sim \text{i.i.d.}(0, \Sigma)$.
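
The two regressions in equations (5) and (6) can be sketched with ordinary least squares alone. This is a minimal illustration under stated assumptions: the function name `engle_granger_two_step` is hypothetical, the ECM is fitted without an intercept exactly as equation (6) is written, and the unit root test on the residuals (which the actual procedure requires before step 2 is meaningful) is omitted.

```python
import numpy as np

def engle_granger_two_step(y1, y2):
    """Return (alpha, beta, phi, theta) from the two OLS steps (illustrative sketch)."""
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)

    # Step 1: cointegrating regression y1_t = alpha + beta * y2_t + z_t
    X = np.column_stack([np.ones_like(y2), y2])
    (alpha, beta), *_ = np.linalg.lstsq(X, y1, rcond=None)
    z = y1 - alpha - beta * y2               # estimated error correction term

    # Step 2: ECM  dy1_t = phi * dy2_t + theta * z_{t-1} + eps_t
    dy1 = np.diff(y1)
    dy2 = np.diff(y2)
    W = np.column_stack([dy2, z[:-1]])       # z[:-1] lines up z_{t-1} with dy1_t
    (phi, theta), *_ = np.linalg.lstsq(W, dy1, rcond=None)
    return alpha, beta, phi, theta
```

A negative `theta` indicates that deviations from the long-run relation are corrected over time; lagging `dy2` before the second regression would give the lagged variant of the ECM.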


10 The parameter $\phi$ approximates the correlation coefficient between first differences in prices
($\Delta y_{1,t}$ and $\Delta y_{2,t}$) and it will be close to 1 when the two commodities are in the same market.
Therefore, a higher value of $\phi$ is a sign of a stronger integration of the market [3].


Table 4. Engle and Granger cointegration test


Dep. variable   Indep. variable   $\beta$   $t_\beta$   p-value
Elect           Gas               0.514     23.90       0.00
Gas             Elect             0.516                 0.00
Elect           Oil               0.732     25.33       0.00
Oil             Elect             0.394                 0.08
Oil             Gas               0.432     29.03       0.00
Gas             Oil               0.805                 0.00


$t_\beta$ are the t-statistics for the coefficients $\beta$ in equation (5). The last column reports the p-values
for the unit root tests on the regression residuals.


Cointegration tests per se do not focus on the economically interesting parameters $\alpha$,
$\beta$, $\phi$ and $\theta$ [3]. The ECM highlights that the deviations from the long-run cointegrating
relationship are corrected gradually through a series of partial short-run adjustments.
In the long-run equilibrium the error correction term will be equal to zero. However, if
the variables $y_{1,t}$ and $y_{2,t}$ deviate from the long-run equilibrium, the error correction
term will be different from zero and each variable adjusts to restore the equilibrium
relation, with a speed of adjustment represented by $\theta$.

The results reported in Table 5 show no significant value for the coefficient $\phi$
in any case. Therefore we apply an ECM using a different lag for the independent
variable.


Table 5. Estimated speed of adjustment parameters for the ECM 


Dep. variable   Indep. variable   $\phi$   $t_\phi$   p-value   $\theta$   $t_\theta$   p-value
Δ Elect         Δ Gas             0.010    0.150      0.880     -0.452     -21.54       0.00
Δ Elect         Δ Oil             -0.427   -1.059     0.289     -0.461     -21.71       0.00
Δ Gas           Δ Oil             0.028    0.189      0.849     -0.053     -6.553       0.00


For electricity and gas, significant coefficients $\phi$ and $\theta$ are found ($\phi = 0.25$,
$\theta = 0.46$) with a lag of two days, indicating that in addition to a long-run relationship a
short-run influence exists between the two series. For the pair electricity/oil, considering
the independent variable with a five-day lag, a significant coefficient, $\phi = 0.68$ (9%
level), is found whereas $\theta = 0.452$; also in this case, the price adjustment in the short
run is detected with a lag of five days. For the pair gas/oil a significant coefficient
$\phi$ is found ($\phi = 0.29$) at the 5% level with a lag of six days. $\theta$ is equal to 0.05,
showing that the speed of adjustment to the long-run equilibrium is particularly low.
The presence of a short-run relationship among the various commodities may also be
explained by the fact that the analysis refers to European markets where deregulation
has not yet been completed. Some of the markets still experience market
power and in this context the oil prices may still represent the leading actor for the
electricity and gas price formation. The misalignment between oil and gas in the short
run may depend on different forces (e.g., the supply of gas from Algeria or Russia)
that may provide some independent source of randomness for natural gas prices. This
may explain why, especially in turbulent periods, gas and oil tend to have different
dynamics, while natural gas prices follow crude oil in the long run.


6 Conclusions 


This paper analyses the dynamics of the prices of oil, electricity and natural gas in the 
European markets in order to estimate the nature of the existing relationship among 
them. The simple correlation analysis among the various time series is ineffective
given the non-stationarity of the data. A cointegration approach is chosen to measure
a possible integration among the markets. 

A cointegration relationship between each pair of commodities is found using the
Engle-Granger approach. The Johansen cointegration test reports that oil, gas and
electricity prices are all cointegrated. Two cointegrating equations are found,
implying that one common trend is present in the energy market. From an economic
point of view this can be interpreted as a single source of risk (the oil market), which
affects the dynamics of the two other commodities (electricity and gas).


Tempered stable distributions and processes in
finance: numerical analysis


Michele Leonardo Bianchi*, Svetlozar T. Rachev, Young Shin Kim, and 
Frank J. Fabozzi 


Abstract. Most of the important models in finance rest on the assumption that randomness 
is explained through a normal random variable. However there is ample empirical evidence 
against the normality assumption, since stock returns are heavy-tailed, leptokurtic and skewed. 
Partly in response to those empirical inconsistencies relative to the properties of the normal 
distribution, a suitable alternative distribution is the family of tempered stable distributions. 
In general, the use of infinitely divisible distributions is obstructed by the difficulty of calibrating
and simulating them. In this paper, we address some numerical issues resulting from tempered
stable modelling, with a view toward the density approximation and simulation.


Key words: stable distribution, tempered stable distributions, Monte Carlo 


1 Introduction 


Since Mandelbrot introduced the $\alpha$-stable distribution in modelling financial asset
returns, numerous empirical studies have been done in both natural and economic
sciences. The works of Rachev and Mittnik [19] and Rachev et al. [18] (see also
references therein) have focused attention on a general framework for market and
credit risk management, option pricing, and portfolio selection based on the $\alpha$-stable
distribution. While the empirical evidence does not support the normal distribution, it
is also not always consistent with the $\alpha$-stable distributional hypothesis. Asset return
time series present heavier tails than the normal distribution and thinner tails than
the $\alpha$-stable distribution. Moreover, the stable scaling properties may cause problems
in calibrating the model to real data. In any case, there is a wide consensus on
the presence of a leptokurtic and skewed pattern in stock returns, as shown by the
$\alpha$-stable modelling. Partly in response to the above empirical inconsistencies, and to
maintain suitable properties of the stable model, a proper alternative to the $\alpha$-stable
distribution is the family of tempered stable distributions.

Tempered stable distributions may have all moments finite and exponential moments
of some order. The latter property is essential in the construction of tempered
stable option pricing models. The formal definition of tempered stable processes has
been proposed in the seminal work of Rosiński [21]. The KoBoL (Koponen, Boyarchenko,
Levendorskii) [4], CGMY (Carr, Geman, Madan, Yor) [5], Inverse Gaussian
(IG) and the tempered stable of Tweedie [22] are only some of the parametric
examples in this class that have an infinite-dimensional parametrisation by a family
of measures [24]. Further extensions or limiting cases are also given by the fractional
tempered stable framework [10], the bilateral gamma [15] and the generalised tempered
stable distribution [7] and [16]. The general formulation is difficult to use in
practical applications, but it allows one to prove some interesting results regarding the
calculus of the characteristic function and the random number generation. The infinite
divisibility of this distribution allows one to construct the corresponding Lévy process
and to analyse the change of measure problem and the process behaviour as well.

The purpose of this paper is to show some numerical issues arising from the use 
of this class in applications to finance with a look at the density approximation and 
random number generation for some specific cases, such as the CGMY and the Kim- 
Rachev (KR) case. The paper is related to some previous works of the authors [13, 14] 
where the exponential Lévy and the tempered stable GARCH models have been 
studied. The remainder of this paper is organised as follows. In Section 2 we review 
the definition of tempered stable distributions and focus our attention on the CGMY 
and KR distributions. An algorithm for the evaluation of the density function for 
the KR distribution is presented in Section 3. Finally, Section 4 presents a general 
random number generation method and an option pricing analysis via Monte Carlo 
simulation. 


2 Basic definitions 


The class of infinitely divisible distributions has a large spectrum of applications and
in recent years, particularly in mathematical finance and econometrics, non-normal
infinitely divisible distributions have been widely studied. In the following, we will
refer to the Lévy-Khinchin representation with Lévy triplet $(a_h, \sigma, \nu)$ as in [16]. Let
us now define the Lévy measure of a TS$_\alpha$ distribution.


Definition 1 A real valued random variable $X$ is TS$_\alpha$ if it is infinitely divisible without
a Gaussian part and has Lévy measure $\nu$ that can be written in polar coordinates


$$\nu(dr, dw) = r^{-\alpha-1} q(r, w)\, dr\, \sigma(dw), \tag{1}$$

where $\alpha \in (0, 2)$ and $\sigma$ is a finite measure on $S^{d-1}$ and

$$q : (0, \infty) \times S^{d-1} \to (0, \infty)$$

is a Borel function such that $q(\cdot, w)$ is completely monotone with $q(\infty, w) = 0$ for
each $w \in S^{d-1}$. A TS$_\alpha$ distribution is called a proper TS$_\alpha$ distribution if


$$\lim_{r \to 0^+} q(r, w) = 1$$


for each $w \in S^{d-1}$.

Furthermore, by theorem 2.3 in [21], the Lévy measure $\nu$ can also be rewritten in
the form


$$\nu(A) = \int_{\mathbb{R}^d_0} \int_0^\infty I_A(tx)\, \alpha t^{-\alpha-1} e^{-t}\, dt\, R(dx), \qquad A \in \mathcal{B}(\mathbb{R}^d), \tag{2}$$

where $R$ is a unique measure on $\mathbb{R}^d$ such that $R(\{0\}) = 0$ and


$$\int_{\mathbb{R}^d} \left(\|x\|^2 \wedge \|x\|^\alpha\right) R(dx) < \infty, \qquad \alpha \in (0, 2). \tag{3}$$


Knowledge of the Lévy measure alone is sometimes not enough to obtain
analytical properties of tempered stable distributions. The definition of the
Rosiński measure $R$ allows one to overcome this problem and to obtain explicit
analytic formulas and more explicit calculations. For instance, the characteristic function
can be rewritten by directly using the measure $R$ instead of $\nu$ (see theorem 2.9 in [21]).
Of course, given a measure $R$ it is always possible to find the corresponding tempering
function $q$; the converse is true as well. As a consequence of this, the specification of
a measure $R$ satisfying conditions (3), or the specification of a completely monotone
function $q$, uniquely defines a TS$_\alpha$ distribution.

Now, let us define two parametric examples. In the first example the measure $R$ is
the sum of two Dirac measures multiplied by suitable constants, while the spectral
measure $R$ of the second example has a nontrivial bounded support. If we set


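$$q(r, \pm 1) = e^{-\lambda_\pm r}, \qquad \lambda_\pm > 0, \tag{4}$$

and the measure

$$\sigma(\{-1\}) = c_- \quad \text{and} \quad \sigma(\{1\}) = c_+, \tag{5}$$

we get

$$\nu(dr) = \frac{c_-}{|r|^{1+\alpha}} e^{-\lambda_- |r|} I_{\{r<0\}}\, dr + \frac{c_+}{r^{1+\alpha}} e^{-\lambda_+ r} I_{\{r>0\}}\, dr. \tag{6}$$


The measures $Q$ and $R$ are given by

$$Q = c_- \delta_{-\lambda_-} + c_+ \delta_{\lambda_+} \tag{7}$$

and

$$R = c_- \lambda_-^\alpha \delta_{-1/\lambda_-} + c_+ \lambda_+^\alpha \delta_{1/\lambda_+}, \tag{8}$$


where $\delta_\lambda$ is the Dirac measure at $\lambda$ (see [21] for the definition of the measure $Q$).
Then the characteristic exponent has the form


$$\begin{aligned}
\psi(u) = iub &+ \Gamma(-\alpha) c_+ \left((\lambda_+ - iu)^\alpha - \lambda_+^\alpha + i\alpha\lambda_+^{\alpha-1} u\right) \\
&+ \Gamma(-\alpha) c_- \left((\lambda_- + iu)^\alpha - \lambda_-^\alpha - i\alpha\lambda_-^{\alpha-1} u\right),
\end{aligned} \tag{9}$$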


where we are considering the Lévy-Khinchin formula with truncation function $h(x) = x$.
This distribution is usually referred to as the KoBoL or generalised tempered stable
(GTS) distribution. If we take $\lambda_+ = M$, $\lambda_- = G$, $c_+ = c_- = C$, $\alpha = Y$ and $m = b$,
we obtain that $X$ is CGMY distributed with expected value $m$. The definition of the
corresponding Lévy process follows.

Definition 2 Let $X_t$ be the process such that $X_0 = 0$ and $E[e^{iuX_t}] = e^{t\psi(u)}$ where


$$\begin{aligned}
\psi(u) = ium &+ \Gamma(-Y) C \left((M - iu)^Y - M^Y + iYM^{Y-1}u\right) \\
&+ \Gamma(-Y) C \left((G + iu)^Y - G^Y - iYG^{Y-1}u\right).
\end{aligned}$$

We call this process the CGMY process with parameter $(C, G, M, Y, m)$ where
$m = E[X_1]$.
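
Because the CGMY exponent involves only elementary functions, it is easy to code and to sanity-check. The sketch below is an illustration, not the authors' implementation; the function name `cgmy_exponent` is an assumption, and $Y$ is assumed to lie in $(0, 2)$ with $Y \ne 1$ so that $\Gamma(-Y)$ is finite.

```python
import math

def cgmy_exponent(u, C, G, M, Y, m=0.0):
    """Characteristic exponent psi(u) of Definition 2 for a real argument u (sketch)."""
    gamma_neg_y = math.gamma(-Y)             # finite for Y in (0, 2), Y != 1
    pos = (M - 1j * u) ** Y - M ** Y + 1j * Y * M ** (Y - 1) * u
    neg = (G + 1j * u) ** Y - G ** Y - 1j * Y * G ** (Y - 1) * u
    return 1j * u * m + gamma_neg_y * C * (pos + neg)
```

The characteristic function at time $t$ is then `cmath.exp(t * cgmy_exponent(u, C, G, M, Y, m))`. Two quick checks follow from the definition: the exponent vanishes at $u = 0$, and its first derivative at zero equals $im$ because the linear terms compensate the jump part.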


A further example is given by the KR distribution [14], with a Rosiński measure
of the following form


$$R(dx) = \left(k_+ r_+^{-p_+} I_{(0, r_+)}(x) |x|^{p_+ - 1} + k_- r_-^{-p_-} I_{(-r_-, 0)}(x) |x|^{p_- - 1}\right) dx, \tag{10}$$


where $\alpha \in (0, 2)$, $k_+, k_-, r_+, r_- > 0$, $p_+, p_- \in (-\alpha, \infty) \setminus \{-1, 0\}$, and $m \in \mathbb{R}$.
The characteristic function can be calculated by theorem 2.9 in [21] and is given in
the following result [14].


Definition 3 Let $X_t$ be a process with $X_0 = 0$ and corresponding to the spectral
measure $R$ defined in (10) with conditions $p_\pm \ne 0$, $p_\pm \ne -1$ and $\alpha \ne 1$, and let
$m = E[X_1]$. By considering the Lévy-Khinchin formula with truncation function
$h(x) = x$, we have $E[e^{iuX_t}] = e^{t\psi(u)}$ with


$$\begin{aligned}
\psi(u) = {} & \frac{k_+ \Gamma(-\alpha)}{p_+}\left({}_2F_1(p_+, -\alpha; 1 + p_+; i r_+ u) - 1 + \frac{i\alpha p_+ r_+ u}{p_+ + 1}\right) \\
& + \frac{k_- \Gamma(-\alpha)}{p_-}\left({}_2F_1(p_-, -\alpha; 1 + p_-; -i r_- u) - 1 - \frac{i\alpha p_- r_- u}{p_- + 1}\right) + ium,
\end{aligned} \tag{11}$$


where ${}_2F_1(a, b; c; x)$ is the hypergeometric function [1]. We call this process the KR
process with parameter $(k_+, k_-, r_+, r_-, p_+, p_-, \alpha, m)$.


3 Evaluating the density function 


In order to calibrate asset returns models through an exponential Lévy process or 
tempered stable GARCH model [13, 14], one needs a correct evaluation of both the 
pdf and cdf functions. With the pdf function it is possible to construct a maximum 
likelihood estimator (MLE), while the cdf function allows one to assess the goodness 
of fit. Even if the MLE method may lead to a local maximum rather than to a global 
one due to the multidimensionality of the optimisation problem, the results obtained 
seem to be satisfactory from the point of view of goodness-of-fit tests. Actually, an 
analysis of estimation methods for this kind of distribution would be interesting, but 
it is far from the purpose of this work. 

Numerical methods are needed to evaluate the pdf function. By the definition of 
the characteristic function as the Fourier transform of the density function [8], we 
consider the inverse Fourier transform that is 


$$f(x) = \frac{1}{2\pi} \int_{\mathbb{R}} e^{-iux} E[e^{iuX}]\, du \tag{12}$$

where $f(x)$ is the density function. If the density function has to be calculated for a
large number of $x$ values, the fast Fourier Transform (FFT) algorithm can be employed
as described in [23]. The use of the FFT algorithm largely improves the speed of the
numerical integration above and the function $f$ is evaluated on a discrete and finite
grid; consequently a numerical interpolation is necessary for $x$ values out of the grid.
Since a computer cannot handle infinite integration bounds, the integral bounds
$(-\infty, \infty)$ in equation (12) are replaced with $[-M, M]$, where $M$ is a large value. We
take $M \sim 2^{16}$ or $2^{15}$ in our study and we have also noted that smaller values of $M$
generate large errors in the density evaluation given by a wave effect in both density
tails. We have to point out that the numerical integration as well as the interpolation
may cause some numerical errors. The method above is a general method that can be
used if the density function is not known in closed form.
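
The truncated inversion integral can be illustrated with plain trapezoidal quadrature. This sketch deliberately does not use the FFT, so it is only practical for a handful of $x$ values; the function name `density_from_cf`, the default bound `M=50.0` and the number of nodes are illustrative assumptions and are much smaller than the values quoted above.

```python
import numpy as np

def density_from_cf(cf, x, M=50.0, n_nodes=2**14 + 1):
    """Approximate f(x) = (1/2pi) * integral over [-M, M] of exp(-iux) cf(u) du."""
    u = np.linspace(-M, M, n_nodes)
    du = u[1] - u[0]
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # one row per x value, one column per integration node
    integrand = np.exp(-1j * np.outer(x, u)) * cf(u)[None, :]
    # trapezoidal weights: half weight on the two end nodes
    w = np.full(n_nodes, du)
    w[0] = w[-1] = du / 2.0
    return (integrand @ w).real / (2.0 * np.pi)
```

Here `cf` must accept a NumPy array of real arguments. For heavy-tailed laws the characteristic function decays slowly, which is why a much larger truncation bound than this default is needed in that setting.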

While the calculus of the characteristic function in the CGMY case involves only
elementary functions, more interesting is the evaluation of the characteristic function
in the KR case that is connected with the Gaussian hypergeometric function. Equation
(11) implies the evaluation of the hypergeometric ${}_2F_1(a, b; c; z)$ function only on the
straight line represented by the subset $I = \{iy \mid y \in \mathbb{R}\}$ of the complex plane $\mathbb{C}$. We
do not need a general algorithm to evaluate the function on the entire complex plane
$\mathbb{C}$, but just on a subset of it. This can be done by means of the analytic continuation,
without having recourse either to numerical integration or to numerical solution of a
differential equation [17] (for a complete table of the analytic continuation formulas
for arbitrary values of $z \in \mathbb{C}$ and of the parameters $a, b, c$, see [3] or [9]). The
hypergeometric function belongs to the special function class and often occurs in
many practical computational problems. It is defined by the power series


$${}_2F_1(a, b; c; z) = \sum_{n=0}^{\infty} \frac{(a)_n (b)_n}{(c)_n} \frac{z^n}{n!}, \qquad |z| < 1, \tag{13}$$


where $(a)_n := \Gamma(a+n)/\Gamma(a)$ is the Pochhammer symbol (see [1]). By [1] the
following relations are fulfilled

$$\begin{aligned}
{}_2F_1(a, b; c; z) &= (1-z)^{-b}\, {}_2F_1\!\left(b, c-a; c; \frac{z}{z-1}\right) && \text{if } \left|\frac{z}{z-1}\right| < 1, \\
{}_2F_1(a, b; c; z) &= (-z)^{-a} \frac{\Gamma(c)\Gamma(b-a)}{\Gamma(c-a)\Gamma(b)}\, {}_2F_1\!\left(a, a-c+1; a-b+1; \frac{1}{z}\right) \\
&\quad + (-z)^{-b} \frac{\Gamma(c)\Gamma(a-b)}{\Gamma(c-b)\Gamma(a)}\, {}_2F_1\!\left(b, b-c+1; b-a+1; \frac{1}{z}\right) && \text{if } \left|\frac{1}{z}\right| < 1, \\
{}_2F_1(a, b; c; -iy) &= \overline{{}_2F_1(a, b; c; iy)} && \text{if } y \in \mathbb{R}.
\end{aligned} \tag{14}$$


First, by the last equality of (14), one can determine the values of ${}_2F_1(a, b; c; z)$ only
for the subset $I_+ = \{iy \mid y \in \mathbb{R}_+\}$ and then simply consider the conjugate for the
set $I_- = \{iy \mid y \in \mathbb{R}_-\}$, remembering that ${}_2F_1(a, b; c; 0) = 1$. Second, in order to
obtain a fast convergence of the series (13), we split the positive imaginary line into
three subsets without intersection,


$$I_+^1 = \{iy \mid 0 < y \le 0.5\}, \qquad I_+^2 = \{iy \mid 0.5 < y \le 1.5\}, \qquad I_+^3 = \{iy \mid y > 1.5\};$$


then we use (13) to evaluate ${}_2F_1(a, b; c; z)$ in $I_+^1$. Then, the first and the second
equalities of (14) together with (13) are enough to evaluate ${}_2F_1(a, b; c; z)$ in $I_+^2$ and
$I_+^3$ respectively. This subdivision allows one to truncate the series (13) to the integer
$N = 500$ and obtain the same results as Mathematica. We point out that the value of
$y$ ranges in the interval $[-M, M]$. This method, together with the MATLAB vector
calculus, considerably increases the speed with respect to algorithms based on the
numerical solution of the differential equation [17]. Our method is grounded only on
basic summations and multiplications. As a result the computational effort in the KR
density evaluation is comparable to that of the CGMY one. The KR characteristic
function is necessary also to price options, not only for MLE estimation. Indeed,
by using the approach of Carr and Madan [6] and the same analytic continuation as
above, risk-neutral parameters may be directly estimated from option prices, without
calibrating the underlying market model.
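
The three-region scheme can be sketched in pure Python as follows. This is an illustration, not the authors' MATLAB code: the function names `_series` and `hyp2f1_imag` are hypothetical, the parameters $a$, $b$, $c$ are assumed real, and the region $y > 1.5$ additionally assumes that none of the gamma function arguments is a non-positive integer (so, for instance, $a - b$ must not be an integer).

```python
import math

def _series(a, b, c, z, n_terms=500):
    """Truncated power series (13), valid for |z| < 1."""
    total = 0j
    term = 1 + 0j
    for n in range(n_terms + 1):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total

def hyp2f1_imag(a, b, c, y, n_terms=500):
    """Evaluate 2F1(a, b; c; i*y) for real y using the relations in (14)."""
    if y == 0:
        return 1 + 0j
    if y < 0:                                # conjugate symmetry (real a, b, c)
        return hyp2f1_imag(a, b, c, -y, n_terms).conjugate()
    z = 1j * y
    if y <= 0.5:                             # direct series
        return _series(a, b, c, z, n_terms)
    if y <= 1.5:                             # first relation, argument z/(z-1)
        return (1 - z) ** (-b) * _series(b, c - a, c, z / (z - 1), n_terms)
    g = math.gamma                           # second relation, argument 1/z
    t1 = ((-z) ** (-a) * g(c) * g(b - a) / (g(c - a) * g(b))
          * _series(a, a - c + 1, a - b + 1, 1 / z, n_terms))
    t2 = ((-z) ** (-b) * g(c) * g(a - b) / (g(c - b) * g(a))
          * _series(b, b - c + 1, b - a + 1, 1 / z, n_terms))
    return t1 + t2
```

A convenient check is the closed form ${}_2F_1(1/2, 1; 3/2; w) = \operatorname{arctanh}(\sqrt{w})/\sqrt{w}$, which exercises all three regions on the imaginary axis.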


4 Simulation of TS$_\alpha$ processes


In order to generate random variates from TS$_\alpha$ processes, we will consider the general
shot noise representation of proper TS$_\alpha$ laws given in [21]. There are different
methods to simulate Lévy processes, but most of these methods are not suitable for 
the simulation of tempered stable processes due to the complicated structure of their 
Lévy measure. As emphasised in [21], the usual method of the inverse of the Lévy 
measure [20] is difficult to implement, even if the spectral measure R has a simple 
form. We will apply theorem 5.1 from [21] to the previously considered parametric 
examples. 


Proposition 1 Let $\{U_j\}$ and $\{T_j\}$ be i.i.d. sequences of uniform random variables in
$(0, 1)$ and $(0, T)$ respectively, $\{E_j\}$ and $\{E'_j\}$ i.i.d. sequences of exponential variables
of parameter 1 and $\{\Gamma_j\} = E'_1 + \ldots + E'_j$, $\{V_j\}$ an i.i.d. sequence of discrete random
variables with distribution

$$P(V_j = -G) = P(V_j = M) = \frac{1}{2},$$

a positive constant $0 < Y < 2$ (with $Y \ne 1$), and $\|\sigma\| = \sigma(S^{d-1}) = 2C$. Furthermore,
$\{U_j\}$, $\{E_j\}$, $\{E'_j\}$ and $\{V_j\}$ are mutually independent. Then


$$X_t = \sum_{j=1}^{\infty} \left[\left(\frac{Y \Gamma_j}{2CT}\right)^{-1/Y} \wedge E_j U_j^{1/Y} |V_j|^{-1}\right] \frac{V_j}{|V_j|} I_{\{T_j \le t\}} + t b_T, \qquad t \in [0, T], \tag{15}$$

where


$$b_T = -\Gamma(1 - Y)\, C \left(M^{Y-1} - G^{Y-1}\right) \tag{16}$$


and $\gamma$ is the Euler constant [1, 6.1.3], converges a.s. and uniformly in $t \in [0, T]$ to
a CGMY process with parameters $(C, G, M, Y, 0)$.
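
A truncated version of the series in Proposition 1 can be sketched as below. This is an illustration under several assumptions: the function name `cgmy_shot_noise` and its arguments are hypothetical, the infinite sum is cut at `n_terms` jumps (the neglected jumps are the smallest ones), and the one-dimensional case with $\|\sigma\| = 2C$ is used, so $V_j$ takes the values $-G$ and $M$ with equal probability.

```python
import math
import numpy as np

def cgmy_shot_noise(C, G, M, Y, T, times, n_terms=1000, rng=None):
    """Values of one simulated CGMY(C, G, M, Y, 0) path at the given times in [0, T]."""
    rng = np.random.default_rng() if rng is None else rng
    U = rng.uniform(0.0, 1.0, n_terms)                  # U_j
    arrival = rng.uniform(0.0, T, n_terms)              # T_j, jump times
    E = rng.exponential(1.0, n_terms)                   # E_j
    Gam = np.cumsum(rng.exponential(1.0, n_terms))      # Gamma_j = E'_1 + ... + E'_j
    V = rng.choice([-G, M], size=n_terms)               # V_j, equal probabilities
    size = np.minimum((Y * Gam / (2.0 * C * T)) ** (-1.0 / Y),
                      E * U ** (1.0 / Y) / np.abs(V))
    jumps = size * np.sign(V)
    b_T = -math.gamma(1.0 - Y) * C * (M ** (Y - 1.0) - G ** (Y - 1.0))
    times = np.asarray(times, dtype=float)
    # add up the jumps that have arrived by each time, then the drift t * b_T
    arrived = arrival[None, :] <= times[:, None]
    return (jumps[None, :] * arrived).sum(axis=1) + times * b_T
```

Repeated calls give independent paths; the terminal values `cgmy_shot_noise(..., times=[T])[0]` can then be fed into a Monte Carlo estimator. Since the process has $m = 0$, the sample mean of the terminal values should be close to zero, which gives a simple sanity check.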


This series representation is not new in the literature, see [2] and [12]. It is a slight
modification of the series representation of the stable distribution [11], but here big
jumps are removed. The shot noise representation for the KR distribution follows.


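Proposition 2 Let $\{U_j\}$ and $\{T_j\}$ be i.i.d. sequences of uniform random variables in
$(0, 1)$ and $(0, T)$ respectively, $\{E_j\}$ and $\{E'_j\}$ i.i.d. sequences of exponential variables
of parameter 1 and $\{\Gamma_j\} = E'_1 + \ldots + E'_j$, and constants $\alpha \in (0, 2)$ (with $\alpha \ne 1$),
$k_+, k_-, r_+, r_- > 0$ and $p_+, p_- \in (-\alpha, \infty) \setminus \{-1, 0\}$. Let $\{V_j\}$ be an i.i.d. sequence
of random variables with density


$$f_V(r) = \frac{1}{\|\sigma\|}\left(k_+ r_+^{-p_+} I_{\{r > 1/r_+\}}\, r^{-\alpha - p_+ - 1} + k_- r_-^{-p_-} I_{\{r < -1/r_-\}}\, |r|^{-\alpha - p_- - 1}\right),$$


where

$$\|\sigma\| = \frac{k_+ r_+^\alpha}{\alpha + p_+} + \frac{k_- r_-^\alpha}{\alpha + p_-}.$$

Furthermore, $\{U_j\}$, $\{E_j\}$, $\{E'_j\}$ and $\{V_j\}$ are mutually independent. If $\alpha \in (0, 1)$, or
if $\alpha \in (1, 2)$ with $k_+ = k_-$, $r_+ = r_-$ and $p_+ = p_-$, then the series


$$X_t = \sum_{j=1}^{\infty} I_{\{T_j \le t\}} \left[\left(\frac{\alpha \Gamma_j}{T \|\sigma\|}\right)^{-1/\alpha} \wedge E_j U_j^{1/\alpha} |V_j|^{-1}\right] \frac{V_j}{|V_j|} + t b_T \tag{17}$$


converges a.s. and uniformly in $t \in [0, T]$ to a KR tempered stable process with
parameters $(k_+, k_-, r_+, r_-, p_+, p_-, \alpha, 0)$ with


$$b_T = -\Gamma(1 - \alpha)\left(\frac{k_+ r_+}{p_+ + 1} - \frac{k_- r_-}{p_- + 1}\right).$$


If $\alpha \in (1, 2)$ and $k_+ \ne k_-$ (or $r_+ \ne r_-$ or alternatively $p_+ \ne p_-$), then


$$X_t = \sum_{j=1}^{\infty} \left[ I_{\{T_j \le t\}} \left(\left(\frac{\alpha \Gamma_j}{T \|\sigma\|}\right)^{-1/\alpha} \wedge E_j U_j^{1/\alpha} |V_j|^{-1}\right) \frac{V_j}{|V_j|} - \frac{t}{T}\left(\frac{\alpha j}{T \|\sigma\|}\right)^{-1/\alpha} x_0 \right] + t b_T, \tag{18}$$


converges a.s. and uniformly in $t \in [0, T]$ to a KR tempered stable process with
parameters $(k_+, k_-, r_+, r_-, p_+, p_-, \alpha, 0)$, where we set


$$b_T = \alpha^{-1/\alpha}\, \zeta\!\left(\frac{1}{\alpha}\right) T^{-1} (T \|\sigma\|)^{1/\alpha} x_0 - \Gamma(1 - \alpha)\, x_1$$


with


$$x_0 = \|\sigma\|^{-1}\left(\frac{k_+ r_+^\alpha}{\alpha + p_+} - \frac{k_- r_-^\alpha}{\alpha + p_-}\right), \qquad x_1 = \frac{k_+ r_+}{p_+ + 1} - \frac{k_- r_-}{p_- + 1},$$


where $\zeta$ denotes the Riemann zeta function [1, 23.2], $\gamma$ is the Euler constant [1, 6.1.3].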


4.1 A Monte Carlo example 


In this section, we assess the goodness of fit of the random number generators proposed in
the previous section. A brief Monte Carlo study is performed and prices of European
put options with different strikes are calculated. We take into consideration a CGMY
process with the same artificial parameters as [16], that is, $C = 0.5$, $G = 2$, $M = 3.5$,
$Y = 0.5$, interest rate $r = 0.04$, initial stock price $S_0 = 100$ and annualised maturity
$T = 0.25$. Furthermore we consider also a GTS process defined by the characteristic
exponent (9) and parameters $c_+ = 0.5$, $c_- = 1$, $\lambda_+ = 3.5$, $\lambda_- = 2$ and $\alpha = 0.5$,
interest rate $r$, initial stock price $S_0$ and maturity $T$ as in the CGMY case.

Monte Carlo prices are obtained through 50,000 simulations. The Esscher transform
with $\theta = -1.5$ is considered to reduce the variance [12]. We want to emphasise
that the Esscher transform is an exponential tilting [21], thus if applied to a CGMY
or a GTS process, it modifies only parameters but not the form of the characteristic
function.

In Table 1 simulated prices and prices obtained by using the Fourier transform
method [6] are compared. Even if there is a competitive CGMY random number
generator, where a time changed Brownian motion is considered [16], we prefer to
use an algorithm based on series representation. Contrary to the CGMY case, in
general there is no constructive method to find the subordinator process that changes
the time of the Brownian motion; that is, we do not know the process $T_t$ such that the
TS$_\alpha$ process $X_t$ can be rewritten as $W_{T(t)}$ [7]. The shot noise representation allows
one to generate any TS$_\alpha$ process.


Table 1. European put option prices computed using the Fourier transform method (price) and
by Monte Carlo simulation (Monte Carlo)


          CGMY                           GTS
Strike    Price      Monte Carlo         Strike    Price      Monte Carlo
80        1.7444     1.7472              80        3.2170     3.2144
85        2.3926     2.3955              85        4.2132     4.2179
90        3.2835     3.2844              90        5.4653     5.4766
95        4.5366     4.5383              95        7.0318     7.0444
100       6.3711     6.3724              100       8.9827     8.9968
105       9.1430     9.1532              105       11.3984    11.4175
110       12.7632    12.7737             110       14.3580    14.3895
115       16.8430    16.8551             115       17.8952    17.9394
120       21.1856    21.2064             120       21.9109    21.9688


5 Conclusions 


In this work, we have focused our attention on the practical implementation of numerical
methods involving the use of TS$_\alpha$ distributions and processes in the field of
finance. Basic definitions are given and a possible algorithm to approximate the density
function is proposed. Furthermore, a general Monte Carlo method is developed
with a look at option pricing.


Transformation kernel estimation of insurance claim
cost distributions


Catalina Bolancé, Montserrat Guillén, and Jens Perch Nielsen 


Abstract. A transformation kernel density estimator that is suitable for heavy-tailed distribu- 
tions is discussed. Using a truncated beta transformation, the choice of the bandwidth parameter 
becomes straightforward. An application to insurance data and the calculation of the value-at- 
risk are presented. 


Key words: non-parametric statistics, actuarial loss models, extreme value theory 


1 Introduction 


The severity of claims is measured in monetary units and is usually referred to as 
insurance loss or claim cost amount. The probability density function of claim amounts 
is usually right skewed, showing a big bulk of small claims and some relatively 
infrequent large claims. For an insurance company, density tails are therefore of 
special interest due to their economic magnitude and their influence on re-insurance 
agreements. 

It is widely known that large claims are highly unpredictable while they are re- 
sponsible for financial instability and so, since solvency is a major concern for both 
insurance managers and insurance regulators, there is a need to estimate the density 
of claim cost amounts and to include the extremes in all the analyses. 

This paper is about estimating the density function nonparametrically when 
data are heavy-tailed. Other approaches are based on extremes, a subject that 
has received much attention in the economics literature. Embrechts et al., Coles, 
and Reiss and Thomas [8, 11, 15] have discussed extreme value theory (EVT) 
in general. Chavez-Demoulin and Embrechts [6], based on Chavez-Demoulin and 
Davison [5], have discussed smooth extremal models in insurance. They focused 
on highlighting nonparametric trends, as a time dependence is present in many 
catastrophic risk situations (such as storms or natural disasters) and in the finan- 
cial markets. A recent work by Cooray and Ananda [9] combines the lognormal 
and the Pareto distribution and derives a distribution which has a suitable shape 
for small claims and can handle heavy tails. Others have addressed this subject
with the g-and-h distribution, like Dutta and Perry [10] for operational risk analysis.
The g-and-h distribution [12] can be formed by two nonlinear transformations
of the standard normal distribution and has two parameters, skewness and
kurtosis.

In previous papers, we have analysed claim amounts in a one-dimensional setting 
and we have proved that a nonparametric approach that accounts for the asymmetric 
nature of the density is preferred for insurance loss distributions [2, 4]. Moreover, 
we have applied the method to a liability data set and compared the nonparametric 
kernel density estimation procedure to classical methods [4]. Several authors [7] have 
devoted much interest to transformation kernel density estimation, which was initially 
proposed by Wand et al. [21] for asymmetrical variables and based on the shifted 
power transformation family. The original method provides a good approximation 
for heavy-tailed distributions. The statistical properties of the density estimators are 
also valid when estimating the cumulative distribution function (cdf). Transformation
kernel estimation turns out to be a suitable approach to estimate quantiles near 1 and 
therefore it can be used to estimate Value-at-Risk (VaR) in financial and insurance- 
related applications. 

Buch-Larsen et al. [4] proposed an alternative transformation based on a gener- 
alisation of the Champernowne distribution; simulation studies have shown that it is 
preferable to other transformation density estimation approaches for distributions that 
are Pareto-like in the tail. In the existing contributions, the choice of the bandwidth 
parameter in transformation kernel density estimation is still a problem. One way of
approaching bandwidth choice is to implement the transformation approach so that it
leads to a beta distribution, then use existing theory to optimise bandwidth parameter
selection on beta distributed data and backtransform to the original scale. The main
drawback is that the beta distribution may be very steep at the domain boundary, which
causes numerical instability when the derivative of the inverse distribution function is
needed for the backward transformation. In this work we propose to truncate the beta
distribution and use the truncated version in transformation kernel density estimation.
The results on the optimal choice of the bandwidth for kernel density estimation of 
beta density are used in the truncated version directly. In the simulation study we 
see that our approach produces very good results for heavy-tailed data. Our results 
are particularly relevant for applications in insurance, where the claims amounts are 
analysed and usually small claims (low cost) coexist with only a few large claims 
(high cost). 

Let $f_X$ be a density function. Terrell and Scott [19] and Terrell [18] analysed several
density families that minimise functionals $\int \{f_X^{(p)}(x)\}^2\,dx$, where superscript (p)
refers to the pth derivative of the density function. We will use these families in
the context of transformed kernel density estimation. The results for those density
families are very useful to improve the properties of the transformation kernel density
estimator.

Given a sample $X_1, \ldots, X_n$ of independent and identically distributed (iid)
observations with density function $f_X$, the classical kernel density estimator is:

$$\hat f_X(x) = \frac{1}{n} \sum_{i=1}^{n} K_b(x - X_i), \qquad (1)$$

where b is the bandwidth or smoothing parameter and $K_b(t) = K(t/b)/b$ is the
kernel. In Silverman [16] or Wand and Jones [20] one can find an extensive review
of classical kernel density estimation.
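
As an illustration of estimator (1), the following minimal sketch evaluates the classical kernel density estimate at a single point. The Epanechnikov kernel is used only as one common choice, and the names `epanechnikov`, `kernel_density` and the sample values are hypothetical, not taken from the paper.

```python
def epanechnikov(t):
    """Epanechnikov kernel K(t) = (3/4)(1 - t^2) for |t| <= 1, else 0."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def kernel_density(x, sample, b, kernel=epanechnikov):
    """Classical estimator (1): (1/n) * sum_i K_b(x - X_i), with K_b(t) = K(t/b)/b."""
    n = len(sample)
    return sum(kernel((x - xi) / b) / b for xi in sample) / n


# Illustrative data (hypothetical, not taken from the paper).
sample = [0.2, 0.5, 0.9, 1.4, 3.0]
density_at_1 = kernel_density(1.0, sample, b=1.0)
```

Only observations within one bandwidth of the evaluation point contribute, which is why a single bandwidth struggles with heavy-tailed data on the original scale.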

An error distance between the estimated density $\hat f_X$ and the theoretical density
$f_X$ that has been widely used in the analysis of the optimal bandwidth b is the mean
integrated squared error (MISE):

$$E\left[ \int \{\hat f_X(x) - f_X(x)\}^2 \, dx \right]. \qquad (2)$$


It has been shown (see, for example, Silverman [16], chapter 3) that the MISE is
asymptotically equivalent to A-MISE:

$$\frac{1}{4} b^4 k_2^2 \int \{f_X''(x)\}^2 \, dx + \frac{1}{nb} \int K(t)^2 \, dt, \qquad (3)$$

where $k_2 = \int t^2 K(t)\,dt$. If the second derivative of $f_X$ exists (and we denote it
by $f_X''$), then $\int \{f_X''(x)\}^2\,dx$ is a measure of the degree of smoothness because the
smoother the density, the smaller this integral is. From the expression for A-MISE
it follows that the smoother $f_X$, the smaller the value of A-MISE.

Terrell and Scott (1985, Lemma 1) showed that Beta(3, 3) defined on the domain
(-1/2, 1/2) minimises the functional $\int \{f''(x)\}^2\,dx$ within the set of beta densities
with the same support. The Beta(3, 3) distribution will be used throughout our work.
Its pdf and cdf are:

$$g(x) = \frac{15}{8} \left(1 - 4x^2\right)^2, \quad -\frac{1}{2} \le x \le \frac{1}{2}, \qquad (4)$$

$$G(x) = \frac{1}{8} \left(4 - 9x + 6x^2\right) (1 + 2x)^3. \qquad (5)$$

We assume that a transformation exists so that $T(X_i) = Z_i$ (i = 1, ..., n) can be
assumed to come from a Uniform(0, 1) distribution. We can again transform the data so
that $G^{-1}(Z_i) = Y_i$ is a random sample from a random variable y with a Beta(3, 3)
distribution, whose pdf and cdf are defined respectively in (4) and (5).
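
The pdf (4), the cdf (5) and the inverse $G^{-1}$ needed for the second transformation can be sketched as follows. The cdf is a quintic polynomial without a convenient closed-form inverse, so this sketch inverts it by bisection; that numerical choice, and the function names, are assumptions made for illustration only.

```python
def beta33_pdf(x):
    """pdf (4) of the Beta(3, 3) distribution on [-1/2, 1/2]."""
    if abs(x) > 0.5:
        return 0.0
    return 15.0 / 8.0 * (1.0 - 4.0 * x * x) ** 2


def beta33_cdf(x):
    """cdf (5) of the Beta(3, 3) distribution on [-1/2, 1/2]."""
    if x <= -0.5:
        return 0.0
    if x >= 0.5:
        return 1.0
    return (4.0 - 9.0 * x + 6.0 * x * x) * (1.0 + 2.0 * x) ** 3 / 8.0


def beta33_inv_cdf(z, tol=1e-12):
    """Numerical inverse of the cdf by bisection on [-1/2, 1/2]."""
    lo, hi = -0.5, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta33_cdf(mid) < z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Applying `beta33_inv_cdf` to values that are approximately Uniform(0, 1) yields values that are approximately Beta(3, 3) distributed on [-1/2, 1/2].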

In this work, we use a parametric transformation T(·), namely the modified
Champernowne cdf, as proposed by Buch-Larsen et al. [4].

Let us define the kernel estimator of the density function for the transformed
variable:

$$\hat g(y) = \frac{1}{n} \sum_{i=1}^{n} K_b(y - Y_i), \qquad (6)$$

which should be as close as possible to a Beta(3, 3). We can obtain an exact
value for the bandwidth parameter that minimises A-MISE of $\hat g$. If
$K(t) = (3/4)(1 - t^2)\,\mathbf{1}(|t| \le 1)$ is the Epanechnikov kernel, where $\mathbf{1}(\cdot)$ equals one when
the condition is true and zero otherwise, then we show that the optimal smoothing
parameter for $\hat g$ if y follows a Beta(3, 3) is:

$$b^{opt} = \left(\frac{1}{5}\right)^{-2/5} \left(\frac{3}{5}\right)^{1/5} (720)^{-1/5} \, n^{-1/5}. \qquad (7)$$
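
The three constants in (7) can be checked numerically: 1/5 is the second moment of the Epanechnikov kernel, 3/5 is the integral of its square, and 720 is the integral of the squared second derivative of the Beta(3, 3) density. The sketch below is only a sanity check under those readings; the midpoint rule and the helper names are assumptions for illustration.

```python
def midpoint_integral(func, lower, upper, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(func(lower + (k + 0.5) * h) for k in range(steps))


def epanechnikov(t):
    """Epanechnikov kernel K(t) = (3/4)(1 - t^2) for |t| <= 1, else 0."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def beta33_second_derivative(y):
    """g''(y) for g(y) = (15/8)(1 - 4y^2)^2, which is 360 y^2 - 30."""
    return 360.0 * y * y - 30.0


# Numerical counterparts of the constants 1/5, 3/5 and 720.
k2 = midpoint_integral(lambda t: t * t * epanechnikov(t), -1.0, 1.0)
kernel_roughness = midpoint_integral(lambda t: epanechnikov(t) ** 2, -1.0, 1.0)
density_roughness = midpoint_integral(
    lambda y: beta33_second_derivative(y) ** 2, -0.5, 0.5)


def optimal_bandwidth(n):
    """Closed form (7) for the Epanechnikov kernel and Beta(3, 3) data."""
    return (1 / 5) ** (-2 / 5) * (3 / 5) ** (1 / 5) * 720 ** (-1 / 5) * n ** (-1 / 5)
```

Multiplying the constants out, (7) simplifies to $(48n)^{-1/5}$, which makes the slow $n^{-1/5}$ decay of the bandwidth easy to see.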


Finally, in order to estimate the density function of the original variable, since
$y = G^{-1}(z) = G^{-1}\{T(x)\}$, the transformation kernel density estimator is:

$$\hat f_X(x) = \hat g(y) \left[G^{-1}\{T(x)\}\right]' T'(x)
= \frac{1}{n} \sum_{i=1}^{n} K_b\left(G^{-1}\{T(x)\} - G^{-1}\{T(X_i)\}\right) \left[G^{-1}\{T(x)\}\right]' T'(x). \qquad (8)$$

The estimator in (8) asymptotically minimises MISE and the properties of the
transformation kernel density estimation (8) are studied in Bolancé et al. [3]. Since we
want to avoid the difficulties of the estimator defined in (8), we will construct the
transformation so as to avoid the extreme values of the beta distribution domain.


2 Estimation procedure 


Let z = T(x) be a Uniform(0, 1); we define a new random variable in the interval
[1 - l, l], where 1/2 < l < 1. The values for l should be close to 1. The new random
variable is $z^* = T^*(x) = (1 - l) + (2l - 1)\,T(x)$. We will discuss the value of l
later.

The pdf of the new variable $y^* = G^{-1}(z^*)$ is proportional to the Beta(3, 3) pdf,
but it is in the [-a, a] interval, where $a = G^{-1}(l)$. Finally, our proposed
transformation kernel density estimation is:

$$\hat f_X(x) = \frac{\hat g(y^*) \left[G^{-1}\{T^*(x)\}\right]' T^{*\prime}(x)}{(2l - 1)}
= \hat g(y^*) \left[G^{-1}\{T^*(x)\}\right]' T'(x)
= \frac{1}{n} \sum_{i=1}^{n} K_b\left(G^{-1}\{T^*(x)\} - G^{-1}\{T^*(X_i)\}\right) \left[G^{-1}\{T^*(x)\}\right]' T'(x). \qquad (9)$$


The value of A-MISE associated to the kernel estimation $\hat g(y^*)$, where the
random variable $y^*$ is defined on an interval that is smaller than the Beta(3, 3) domain,
is:

$$A\text{-}MISE_a = \frac{1}{4} b^4 k_2^2 \int_{-a}^{a} \{g''(y)\}^2 \, dy + \frac{1}{nb} \int_{-a}^{a} g(y)\,dy \int K(t)^2 \, dt. \qquad (10)$$

And finally, the optimal bandwidth parameter based on the asymptotic mean integrated
squared error measure for $\hat g(y^*)$ is:

$$\begin{aligned}
b_g^{opt} &= k_2^{-2/5} \left( \int K(t)^2\,dt \int_{-a}^{a} g(y)\,dy \right)^{1/5} \left( \int_{-a}^{a} \{g''(y)\}^2\,dy \right)^{-1/5} n^{-1/5} \\
&= \left(\frac{1}{5}\right)^{-2/5} \left( \frac{3}{5} \cdot \frac{1}{4}\, a \left(-40a^2 + 48a^4 + 15\right) \right)^{1/5}
\times \left( 360a \left(-40a^2 + 144a^4 + 5\right) \right)^{-1/5} n^{-1/5}. \qquad (11)
\end{aligned}$$

The difficulty that arises when implementing the transformation kernel estimation
expressed in (9) is the selection of the value of l. This value can be chosen subjectively
as discussed in the simulation results by Bolancé et al. [3]. Let $X_i$, i = 1, ..., n, be
iid observations from a random variable with an unknown density $f_X$. The resulting
transformation kernel density estimator of $f_X$ is called KIBMCE (kernel inverse beta modified
Champernowne estimator).
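
The truncation step and the bandwidth (11) can be sketched as follows, assuming the Epanechnikov kernel and the Beta(3, 3) cdf in (5). The helper names and the bisection used for $a = G^{-1}(l)$ are illustrative assumptions. Two checks follow from the formulas themselves: the mass of g on [-a, a] equals 2l - 1 by symmetry, and a = 1/2 recovers the untruncated bandwidth (7).

```python
def beta33_cdf(x):
    """cdf (5) of the Beta(3, 3) distribution on [-1/2, 1/2]."""
    if x <= -0.5:
        return 0.0
    if x >= 0.5:
        return 1.0
    return (4.0 - 9.0 * x + 6.0 * x * x) * (1.0 + 2.0 * x) ** 3 / 8.0


def beta33_inv_cdf(z, tol=1e-12):
    """Numerical inverse of the cdf by bisection."""
    lo, hi = -0.5, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta33_cdf(mid) < z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rescale(z, l):
    """T*: map a Uniform(0, 1) value z into the interval [1 - l, l]."""
    return (1.0 - l) + (2.0 * l - 1.0) * z


def truncated_mass(a):
    """Integral of g over [-a, a], i.e. (1/4) a (-40a^2 + 48a^4 + 15)."""
    return 0.25 * a * (-40.0 * a ** 2 + 48.0 * a ** 4 + 15.0)


def truncated_roughness(a):
    """Integral of g''^2 over [-a, a], i.e. 360a (-40a^2 + 144a^4 + 5)."""
    return 360.0 * a * (-40.0 * a ** 2 + 144.0 * a ** 4 + 5.0)


def bandwidth_from_a(a, n):
    """Bandwidth (11) for the Epanechnikov kernel, given the truncation point a."""
    return ((1 / 5) ** (-2 / 5) * ((3 / 5) * truncated_mass(a)) ** (1 / 5)
            * truncated_roughness(a) ** (-1 / 5) * n ** (-1 / 5))


def bandwidth_from_l(l, n):
    """Bandwidth (11) expressed through the trimming parameter l."""
    return bandwidth_from_a(beta33_inv_cdf(l), n)
```

For example, `bandwidth_from_l(0.99, 1000)` gives the bandwidth for the larger of the two trimming values used in the simulation study.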


3 VaR estimation 


In finance and insurance, the VaR represents the magnitude of extreme events and
therefore it is used as a risk measure; formally, the VaR is a quantile. Let x be a loss
random variable with distribution function $F_X$; given a probability level p, the VaR of x
is $VaR(x, p) = \inf\{x : F_X(x) \ge p\}$. Since $F_X$ is a continuous and nondecreasing
function, $VaR(x, p) = F_X^{-1}(p)$, where p is a probability near 1 (0.95, 0.99, ...).
One way of approximating VaR(x, p) is based on the empirical distribution function,
but this has often been criticised because the empirical estimation is based only on
a limited number of observations, and np may not even be an integer. As
an alternative to the empirical distribution approach, classical kernel estimation of
the distribution function can be useful, but this method will be very imprecise for
asymmetrical or heavy-tailed variables.

Swanepoel and Van Graan [17] propose to use a nonparametric transformation of 
the data, which is equal to a classical kernel estimation of the distribution function. 
We propose to use a parametric transformation based on a distribution function. 

Given a transformation function Tr(x), it follows that $F_X(x) = F_{Tr(X)}(Tr(x))$.
So, the computation of VaR(x, p) is based on the kernel estimation of the distribution
function of the transformed variable.

Kernel estimation of the distribution function is [1, 14]:

$$\hat F_{Tr(X)}(Tr(x)) = \frac{1}{n} \sum_{i=1}^{n} \int_{-1}^{\frac{Tr(x) - Tr(X_i)}{b}} K(t)\,dt. \qquad (12)$$

Therefore, the VaR(x, p) can be found as:

$$VaR(x, p) = Tr^{-1}\left[VaR(Tr(x), p)\right] = Tr^{-1}\left[\hat F^{-1}_{Tr(X)}(p)\right]. \qquad (13)$$
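
Equations (12) and (13) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the logarithm is used as a stand-in for the transformation Tr (the paper uses the modified Champernowne cdf followed by the inverse beta cdf), the Epanechnikov kernel is assumed, the quantile is found by bisection, and the claim amounts are hypothetical.

```python
import math


def epanechnikov_cdf(u):
    """Integral of the Epanechnikov kernel from -1 to u."""
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return 0.25 * (2.0 + 3.0 * u - u ** 3)


def kernel_cdf(t, transformed, b):
    """Estimator (12) evaluated at a point t on the transformed scale."""
    return sum(epanechnikov_cdf((t - ti) / b) for ti in transformed) / len(transformed)


def var_estimate(sample, p, b, tr=math.log, tr_inv=math.exp, tol=1e-10):
    """Estimator (13): invert the kernel cdf by bisection, then back-transform."""
    transformed = [tr(x) for x in sample]
    # The kernel cdf is 0 at lo and 1 at hi because the kernel has support [-1, 1].
    lo, hi = min(transformed) - b, max(transformed) + b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, transformed, b) < p:
            lo = mid
        else:
            hi = mid
    return tr_inv(0.5 * (lo + hi))


# Hypothetical claim amounts, for illustration only.
claims = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 50.0, 100.0, 400.0]
var_95 = var_estimate(claims, p=0.95, b=0.5)
```

Because the smoothing happens on the transformed scale, the back-transformed quantile can lie between or beyond observed values, which the empirical quantile cannot do.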


48 C. Bolancé, M. Guillén, and J.P. Nielsen 
4 Simulation study 


This section presents a comparison of our inverse beta transformation method with the 
results presented by Buch-Larsen et al. [4] based only on the modified Champernowne 
distribution. Our objective is to show that the second transformation, which is based 
on the inverse of a beta distribution, improves density estimation. 

In this work we analyse the same simulated samples as in Buch-Larsen et al. [4], 
which were drawn from four distributions with different tails and different shapes near 
0: lognormal, lognormal-Pareto, Weibull and truncated logistic. The distributions and 
the chosen parameters are listed in Table 1. 


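Table 1. Distributions in simulation study


Distribution                             Density                                                                                                    Parameters

Lognormal($\mu, \sigma$)                 $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}\,x} e^{-\frac{(\log x - \mu)^2}{2\sigma^2}}$                              $(\mu, \sigma) = (0, 0.5)$

Weibull($\gamma$)                        $f(x) = \gamma x^{\gamma - 1} e^{-x^{\gamma}}$                                                             $\gamma = 1.5$

Mixture of p Lognormal($\mu, \sigma$)    $f(x) = p \frac{1}{\sqrt{2\pi\sigma^2}\,x} e^{-\frac{(\log x - \mu)^2}{2\sigma^2}}$                            $(p, \mu, \sigma, \lambda, \rho, c) = (0.7, 0, 1, 1, 1, -1)$
and (1 - p) Pareto($\lambda, \rho, c$)   $\qquad + (1 - p)(x - c)^{-(\rho + 1)} \rho \lambda^{\rho}$                                                $\qquad\qquad\qquad\quad\; = (0.3, 0, 1, 1, 1, -1)$

Tr. Logistic                             $f(x) = \frac{2}{s} e^{x/s} \left(1 + e^{x/s}\right)^{-2}$                                                 $s = 1$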


Buch-Larsen et al. [4] evaluate the performance of the KMCE estimators compared
to the estimator described by Clements et al. [7], the estimator described by
Wand et al. [21] and the estimator described by Bolancé et al. [2]. The Champernowne
transformation substantially improves the results from previous authors. Hence,
if the second transformation, based on the inverse beta transformation,
improves on the results presented in Buch-Larsen et al. [4], the double-transformation
method presented here is a substantial gain with respect to existing
methods.

We measure the performance of the estimators by the error measures based on the $L_1$
norm, the $L_2$ norm and WISE. The last one weighs the distance between the estimated
and the true distribution with the squared value of x. This results in an error measure
that emphasises the tail of the distribution, which is very relevant in practice when
dealing with income or cost data:

$$\left( \int_0^{\infty} \left\{\hat f(x) - f(x)\right\}^2 x^2 \, dx \right)^{1/2}. \qquad (14)$$
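
The WISE in (14) can be approximated numerically once the estimated and the true densities are available as functions. The sketch below is an assumption-laden illustration: it truncates the infinite integral at a finite `upper` bound and uses the midpoint rule, and the two exponential densities are hypothetical stand-ins rather than densities from the simulation study.

```python
import math


def wise(f_hat, f_true, upper, steps=20000):
    """Midpoint-rule approximation of (14), with the integral truncated at `upper`."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += (f_hat(x) - f_true(x)) ** 2 * x * x  # squared error weighted by x^2
    return math.sqrt(h * total)


def true_density(x):
    """Hypothetical true density: exponential with rate 1."""
    return math.exp(-x)


def estimated_density(x):
    """Hypothetical estimate: exponential with rate 1.1."""
    return 1.1 * math.exp(-1.1 * x)


error = wise(estimated_density, true_density, upper=50.0)
```

The factor x^2 means that a given pointwise error counts for more the further out in the tail it occurs.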


The simulation results can be found in Table 2. For every simulated density and
for sample sizes N = 100 and N = 1000, the results presented here correspond to
the following error measures: L1, L2 and WISE for different values of the trimming
parameter l = 0.99, 0.98. The benchmark results are labelled KMCE and they
correspond to those presented in Buch-Larsen et al. [4].


Table 2. Estimated error measures (L1, L2 and WISE) for KMCE and KIBMCE with l = 0.99
and l = 0.98 for sample sizes 100 and 1000 based on 2000 replications


                                    Lognormal   Log-Pareto   Log-Pareto   Weibull   Tr. Logistic
                                                p = 0.7      p = 0.3

N = 100    L1     KMCE              0.1363      0.1287       0.1236       0.1393    0.1294
                  KIBMCE l = 0.99   0.1335      0.1266       0.1240       0.1374    0.1241
                  KIBMCE l = 0.98   0.1289      0.1215       0.1191       0.1326    0.1202

           L2     KMCE              0.1047      0.0837       0.0837       0.1084    0.0786
                  KIBMCE l = 0.99   0.0981      0.0875       0.0902       0.1085    0.0746
                  KIBMCE l = 0.98   0.0956      0.0828       0.0844       0.1033    0.0712

           WISE   KMCE              0.1047      0.0859       0.0958       0.0886    0.0977
                  KIBMCE l = 0.99   0.0972      0.0843       0.0929       0.0853    0.0955
                  KIBMCE l = 0.98   0.0948      0.0811       0.0909       0.0832    0.0923

N = 1000   L1     KMCE              0.0659      0.0530       0.0507       0.0700    0.0598
                  KIBMCE l = 0.99   0.0544      0.0509       0.0491       0.0568    0.0497
                  KIBMCE l = 0.98   0.0550      0.0509       0.0522       0.0574    0.0524

           L2     KMCE              0.0481      0.0389       0.0393       0.0582    0.0339
                  KIBMCE l = 0.99   0.0394      0.0382       0.0393       0.0466    0.0298
                  KIBMCE l = 0.98   0.0408      0.0385       0.0432       0.0463    0.0335

           WISE   KMCE              0.0481      0.0384       0.0417       0.0450    0.0501
                  KIBMCE l = 0.99   0.0393      0.0380       0.0407       0.0358    0.0393
                  KIBMCE l = 0.98   0.0407      0.0384       0.0459       0.0369    0.0394

In general, we can conclude that after a second transformation based on the inverse 
of a modified beta distribution cdf, the error measures diminish with respect to the 
KMCE method. In some situations the errors diminish quite substantially with respect 
to the existing approaches. 

The error measure that improves consistently when using the KIBMCE estimator is
the WISE, which means that this new approach fits the tail
of positive distributions better than existing alternatives. The WISE error measure
is always smaller for the KIBMCE than for the KMCE for at least one of the two
values of l that have been used in this simulation study. This makes the
KIBMCE estimator especially suitable for positive heavy-tailed distributions. When
looking more closely at the results for the mixture of a lognormal distribution and a
Pareto tail, we see that the larger value of l is needed to improve on the error measures
obtained with the KMCE method, but only for N = 1000. For N = 100, the
opposite conclusion follows.

For the truncated logistic distribution, the lognormal distribution
and the Weibull distribution, the method presented here is clearly better than the
existing KMCE. Table 2 shows that for N = 1000, the KIBMCE WISE is
about 20% lower than the KMCE WISE for these distributions. A similar behaviour
is shown by the other error measures, L1 and L2, which for N = 1000 are about
15% lower for the KIBMCE.
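
The "about 20%" figure can be reproduced directly from Table 2. The short calculation below copies the N = 1000 WISE values for the three distributions named above (KMCE versus KIBMCE with l = 0.99) and computes the relative reduction; the variable names are illustrative.

```python
# WISE values for N = 1000 copied from Table 2 (KMCE vs. KIBMCE with l = 0.99).
wise_kmce = {"Lognormal": 0.0481, "Weibull": 0.0450, "Tr. Logistic": 0.0501}
wise_kibmce = {"Lognormal": 0.0393, "Weibull": 0.0358, "Tr. Logistic": 0.0393}

# Relative reduction of the error measure when moving from KMCE to KIBMCE.
reduction = {
    name: (wise_kmce[name] - wise_kibmce[name]) / wise_kmce[name]
    for name in wise_kmce
}

for name, value in reduction.items():
    print(f"{name}: {100 * value:.1f}% lower")
```

The three reductions fall between roughly 18% and 22%, consistent with the statement in the text.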

Note that the KMCE method was studied in [4] and the simulation study showed
that it improved on the error measures for the existing methodological approaches [7,
21].


5 Data study 


In this section, we apply our estimation method to a data set that contains automobile 
claim costs from a Spanish insurance company for accidents that occurred in 1997. 
This data set was analysed in detail by Bolancé et al. [2]. It is a typical insurance 
claims amount data set, i.e., a large sample that looks heavy-tailed. The data are 
divided into two age groups: claims from policyholders who are less than 30 years old 
and claims from policyholders who are 30 years old or older. The first group consists 
of 1061 observations in the interval [1;126,000] with mean value 402.70. The second 
group contains 4061 observations in the interval [1;17,000] with mean value 243.09. 
The estimated parameters of the modified Champernowne distribution function
for the two samples are $\hat\alpha_1 = 1.116$, $\hat M_1 = 66$, $\hat c_1 = 0.000$ for young drivers and
$\hat\alpha_2 = 1.145$, $\hat M_2 = 68$, $\hat c_2 = 0.000$ for older drivers. We notice that
$\hat\alpha_1 < \hat\alpha_2$, which indicates that the data set for young drivers has a heavier
tail than the data set for older drivers.

For small costs, the KIBMCE density in the density peak is greater than for 
the KMCE approach proposed by Buch-Larsen et al. [4] both for young and older 
drivers. For both methods, the tail in the estimated density of young policyholders is 
heavier than the tail of the estimated density of older policyholders. This can be taken 
as evidence that young drivers are more likely to claim a large amount than older 
drivers. The KIBMCE method produces lighter tails than the KMCE method. Based
on the results in the simulation study presented in Bolancé et al. [3], we believe that 
the KIBMCE method improves the estimation of the density in the extreme claims 
class. 


Table 3. Estimation of VaR at the 95% level, in thousands

          Empirical   KMCE   KIBMCE l = 0.99   KIBMCE l = 0.98

Young     1104        2912   1601              1716
Older     1000        1827   1119              1146


Table 3 presents the VaR at the 95% level, which is obtained from the empirical
distribution estimation and the computations obtained with the KMCE and KIBMCE.
We believe that the KIBMCE provides an adequate estimation of the VaR and it seems
a recommendable approach to be used in practice.


What do distortion risk measures tell us on excess of
loss reinsurance with reinstatements?


Antonella Campana and Paola Ferretti 


Abstract. In this paper we focus our attention on the study of an excess of loss reinsurance 
with reinstatements, a problem previously studied by Sundt and, more recently, by Mata and 
Hurlimann. It is well known that the evaluation of pure premiums requires knowledge of the 
claim size distribution of the insurance risk: in order to face this question, different approaches 
have been followed in the actuarial literature. In a situation of incomplete information in which 
only some characteristics of the involved elements are known, it appears to be particularly 
interesting to set this problem in the framework of risk-adjusted premiums. It is shown that if 
risk-adjusted premiums satisfy a generalised expected value equation, then the initial premium 
exhibits some regularity properties as a function of the percentages of reinstatement. 


Key words: excess of loss reinsurance, reinstatements, distortion risk measures 


1 Introduction 


In recent years the study of excess of loss reinsurance with reinstatements has become 
a major topic, in particular with reference to the classical evaluation of pure premiums, 
which is based on the collective model of risk theory. 

The problem, previously studied by Sundt [5] and, more recently, by Mata [4]
and Hurlimann [3], requires the evaluation of pure premiums given the knowledge
of the claim size distribution of the insurance risk: in order to face this question,
different approaches have been followed in the actuarial literature. Sundt [5] based the
computation on the Panjer recursion numerical method and Hurlimann [3] provided
distribution-free approximations to pure premiums.

In a situation of incomplete information in which only some characteristics of 
the involved elements are known, it appears to be particularly interesting to set this 
problem in the framework of risk-adjusted premiums. 

We start from the methodology developed by Sundt [5] to price excess of loss
reinsurance with reinstatements for pure premiums and, with the aim of relaxing the basic
hypothesis made by Walhin and Paris [6], who calculated the initial premium P under
the Proportional Hazard transform premium principle, we address our analysis to the
study of the role played by risk-adjusted premium principles. The particular choice
in the proposal of Walhin and Paris of the PH-transform risk measure strengthens our
interest in the study of risk-adjusted premiums that belong to the class of distortion
risk measures defined by Wang [7].

In the mathematical model we studied (for more details see Campana [1]), when
the reinstatements are paid ($0 \le c_i \le 1$ is the ith percentage of reinstatement)
the total premium income $\delta(P)$ becomes a random variable which is correlated to
the aggregate claims S. Since risk measures satisfy the properties of linearity and
additivity for comonotonic risks (see [2]) and layers are comonotonic risks, we can
define the function

$$F(P, c_1, c_2, \ldots, c_K) = P \left[ 1 + \frac{1}{m} \sum_{i=0}^{K-1} c_{i+1} W_{g_1}\big(L_X(im, (i+1)m)\big) \right]
- \sum_{i=0}^{K} W_{g_2}\big(L_X(im, (i+1)m)\big) \qquad (1)$$

where $g_1$ and $g_2$ are distortion functions and $W_g(X)$ denotes the distortion risk
measure of X. This function gives a measure of the distance between two distortion risk
measures: that of the total premium income $\delta(P)$ and that of the aggregate claims S.
The choice of risk-adjusted premiums satisfying the expected value equation ensures
that the previous distance is null: in this way, it is possible to study the initial premium
P as a function of the percentages of reinstatement.

The paper is organised as follows. In Section 2 we first review some basic settings 
for describing the excess of loss reinsurance model and we review some definitions 
and preliminary results in the field of non-proportional reinsurance covers. Section 3 
is devoted to the problem of detecting the total initial premium: we present the study 
of the case in which the reinstatements are paid in order to consider the total premium 
income as a random variable which is correlated to the aggregate claims. The analysis 
is set in the framework of distortion risk measures: some basic definitions and results 
in this field are recalled. Section 4 presents the main results related to the problem of 
measuring the total initial premium as a function of the percentages of reinstatement, 
a dependence that is generally neglected in the literature. Some concluding remarks
in Section 5 end the paper. 


2 Excess of loss reinsurance with reinstatements: problem setting 


The excess of loss reinsurance model we study in this paper is related to the model 
that has been proposed and analysed by Sundt [5]. Some notations, abbreviations and 
conventions used throughout the paper are the following. 

An excess of loss reinsurance for the layer m in excess of d, written m xs d, is 
a reinsurance which covers the part of each claim that exceeds the deductible d but 
with a limit on the payment of each claim, which is set equal to m; in other words, 
the reinsurer covers for each claim of size Y the amount 


$$L_Y(d, d + m) = \min\{(Y - d)_+, m\}$$

where $(a)_+ = a$ if a > 0, otherwise $(a)_+ = 0$.

We consider an insurance portfolio: N is the number of claims that occurred in
the portfolio during the reference year and $Y_i$ is the ith claim size (i = 1, 2, ..., N).
The aggregate claims to the layer is the random sum given by

$$X = \sum_{i=1}^{N} L_{Y_i}(d, d + m).$$

It is assumed that X = 0 when N = 0. An excess of loss reinsurance, or for short
an XL reinsurance, for the layer m xs d with aggregate deductible D and aggregate
limit M covers only the part of X that exceeds D but with a limit M:

$$L_X(D, D + M) = \min\{(X - D)_+, M\}.$$
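
The per-claim layer, the aggregate claims to the layer and the aggregate cover defined above can be sketched as follows. The function names and the claim amounts are hypothetical and serve only to illustrate the definitions.

```python
def layer(amount, lower, upper):
    """L(lower, upper) applied to an amount: min{(amount - lower)_+, upper - lower}."""
    return min(max(amount - lower, 0.0), upper - lower)


def aggregate_to_layer(claims, d, m):
    """X: sum over the claims Y_i of L_{Y_i}(d, d + m); an empty list gives X = 0."""
    return sum(layer(y, d, d + m) for y in claims)


def aggregate_cover(x, D, M):
    """L_X(D, D + M): part of X above the aggregate deductible D, capped at M."""
    return layer(x, D, D + M)


# Hypothetical claim sizes for the layer 20 xs 10.
claims = [5.0, 15.0, 50.0]
x = aggregate_to_layer(claims, d=10.0, m=20.0)
covered = aggregate_cover(x, D=0.0, M=20.0)
```

In this example the three claims contribute 0, 5 and 20 to the layer, so X = 25, and an aggregate limit of 20 with no aggregate deductible caps the reinsurer's payment at 20.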


This cover is called an XL reinsurance for the layer m xs d with aggregate layer 
M xs D. 

Generally it is assumed that the aggregate limit M is given as a whole multiple of 
the limit m, i.e., M = (K +1)m: in this case we say that there is a limit to the number 
of losses covered by the reinsurer. This reinsurance cover is called an XL reinsurance 
for the layer m xs d with aggregate deductible D and K reinstatements and provides 
total cover for the following amount 


$$L_X(D, D + (K + 1)m) = \min\{(X - D)_+, (K + 1)m\}. \qquad (2)$$


Let P be the initial premium: it covers the original layer, that is


$$L_X(D, D + m) = \min\{(X - D)_+, m\}. \qquad (3)$$


It can be considered as the 0-th reinstatement. 
The condition that the reinstatement is paid pro rata means that the premium for
the ith reinstatement is a random variable given by

$$\frac{c_i P}{m} L_X(D + (i - 1)m, D + im) \qquad (4)$$

where $0 \le c_i \le 1$ is the ith percentage of reinstatement. If $c_i = 0$ the reinstatement
is free, otherwise it is paid.

The related total premium income is a random variable, say $\delta(P)$, which is defined
as

$$\delta(P) = P \left( 1 + \frac{1}{m} \sum_{i=0}^{K-1} c_{i+1} L_X(D + im, D + (i + 1)m) \right). \qquad (5)$$
From the point of view of the reinsurer, the aggregate claims S paid by the reinsurer
for this XL reinsurance treaty, namely

$$S = L_X(D, D + (K + 1)m) \qquad (6)$$

satisfy the relation

$$S = \sum_{i=0}^{K} L_X(D + im, D + (i + 1)m). \qquad (7)$$


3 Initial premium, aggregate claims and distortion risk measures


The total premium income $\delta(P)$ is a random variable which is correlated to the
aggregate claims S in the case in which the reinstatements are paid. It is therefore
not obvious how to calculate the initial premium P.
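
To make this dependence concrete, the following sketch evaluates the total premium income in (5) and the aggregate claims in (6) and (7) for a given value of the aggregate claims to the layer X. The function names and the numerical inputs are hypothetical; the list `c` holds the percentages $c_1, \ldots, c_K$.

```python
def layer(amount, lower, upper):
    """L(lower, upper) applied to an amount: min{(amount - lower)_+, upper - lower}."""
    return min(max(amount - lower, 0.0), upper - lower)


def aggregate_claims(x, D, m, K):
    """S in (6): the aggregate layer (K + 1)m in excess of D."""
    return layer(x, D, D + (K + 1) * m)


def aggregate_claims_by_layer(x, D, m, K):
    """S in (7): sum of the K + 1 consecutive layers of width m."""
    return sum(layer(x, D + i * m, D + (i + 1) * m) for i in range(K + 1))


def total_premium_income(P, x, D, m, c):
    """delta(P) in (5); c[i] holds c_{i+1}, so len(c) equals K."""
    paid = sum(c[i] * layer(x, D + i * m, D + (i + 1) * m) for i in range(len(c)))
    return P * (1.0 + paid / m)


# Hypothetical treaty: m = 10, no aggregate deductible, K = 2 reinstatements.
s = aggregate_claims(25.0, D=0.0, m=10.0, K=2)
income = total_premium_income(2.0, 25.0, D=0.0, m=10.0, c=[1.0, 0.5])
```

Both quantities are driven by the same X: larger aggregate claims consume more layers and therefore trigger more reinstatement premium, which is the source of the correlation mentioned above.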

Despite its importance in practice, only recently have some authors turned their
attention to the study of techniques to calculate the initial premium. More precisely,
Sundt [5] proposed the methodology to calculate the initial premium P under pure
premiums and premiums loaded by the standard deviation principle.

Looking at the pure premium principle for which the expected total premium
income should be equal to the expected aggregate claims payments

$$E[\delta(P)] = E[S], \qquad (8)$$

it is quite natural to consider the case in which premium principles belong to more
general classes: with the aim of filling this gap, we focus our attention on the class
of distortion risk measures. Our interest is supported by Walhin and Paris [6], who
calculated the initial premium P under the Proportional Hazard transform premium
principle. Even if their analysis is conducted by a numerical recursion, the choice
of the PH-transform risk measure as a particular concave distortion risk measure
strengthens our interest.

Furthermore, in an excess of loss reinsurance with reinstatements the computation 
of premiums requires the knowledge of the claim size distribution of the insurance 
risk: with reference to the expected value equation of the XL reinsurance with rein- 
statements (8), Sundt [5] based the computation on the Panjer recursion numerical 
method and Hurlimann [3] provided distribution-free approximations to pure premi- 
ums. 

Note that both authors considered only the case of equal reinstatements, a particular
hypothesis on basic elements characterising the model. 

In this paper we set our analysis in the framework of distortion risk measures: 
the core of our proposal is represented by the choice of a more general equation 
characterising the excess of loss reinsurance with reinstatements, in such a way that 
it is possible to obtain some general properties satisfied by the initial premium as a 
function of the percentages of reinstatement. In order to present the main results, we 
recall some basic definitions and results. 


3.1 Distortion risk measures 


A risk measure is defined as a mapping from the set of random variables, namely losses
or payments, to the set of real numbers. In actuarial science common risk measures 
are premium principles; other risk measures are used for determining provisions and 
capital requirements of an insurer in order to avoid insolvency (see e.g., Dhaene et 
al. [2]). 

In this paper we consider the distortion risk measure introduced by Wang [7]:

$$W_g(X) = \int_0^{\infty} g(H_X(x))\,dx \qquad (9)$$

where the distortion function g is defined as a non-decreasing function g : [0, 1] →
[0, 1] such that g(0) = 0 and g(1) = 1. As is well known, the quantile risk measure
and the Tail Value-at-Risk are examples of risk measures belonging to this class. In
the particular case of a power g function, i.e., $g(x) = x^{1/\rho}$, $\rho \ge 1$, the corresponding
risk measure is the PH-transform risk measure, which is the choice made by Walhin
and Paris [6].
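
For a discrete, non-negative risk the integral in (9) reduces to a finite sum, which the following sketch computes. It assumes that $H_X(x) = \Pr(X > x)$ is the decumulative distribution function, which is the usual reading of Wang's definition but is not restated in this section; the function names and the two-point example are hypothetical.

```python
def distortion_risk_measure(values, probs, g):
    """W_g(X) for a discrete X >= 0: integral over x >= 0 of g(Pr(X > x))."""
    total, previous, tail = 0.0, 0.0, 1.0
    for x, p in sorted(zip(values, probs)):
        # Pr(X > t) is constant and equal to `tail` for t in [previous, x).
        total += (x - previous) * g(max(tail, 0.0))
        previous = x
        tail -= p
    return total


def ph_transform(rho):
    """Power distortion g(u) = u^(1/rho) of the PH-transform risk measure."""
    return lambda u: u ** (1.0 / rho)


def identity(u):
    """Identity distortion, which gives back the expected value."""
    return u


# Hypothetical two-point risk: a loss of 10 with probability 0.25, else 0.
values, probs = [0.0, 10.0], [0.75, 0.25]
expected_value = distortion_risk_measure(values, probs, identity)
ph_premium = distortion_risk_measure(values, probs, ph_transform(2.0))
```

With the identity distortion the measure is the expected value 2.5, while the PH transform with ρ = 2 inflates the tail probability 0.25 to 0.5 and gives a loaded premium of 5.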

Distortion risk measures satisfy the following properties (see Wang [7] and Dhaene
et al. [2]):


P1. Additivity for comonotonic risks

$$W_g(S^c) = \sum_{i=1}^{n} W_g(X_i) \qquad (10)$$

where $S^c$ is the sum of the components of the random vector $X^c$ with the same
marginal distributions as X and with the comonotonic dependence structure.

P2. Positive homogeneity

$$W_g(aX) = a\,W_g(X) \quad \text{for any non-negative constant } a; \qquad (11)$$

P3. Translation invariance

$$W_g(X + b) = W_g(X) + b \quad \text{for any constant } b; \qquad (12)$$

P4. Monotonicity

$$W_g(X) \le W_g(Y) \qquad (13)$$

for any two random variables X and Y where X ≤ Y with probability 1.


In the particular case of a concave distortion function, the related distortion risk
measure satisfying properties P1-P4 is also sub-additive and it preserves stop-loss
order. It is well known that examples of concave distortion risk measures are the Tail
Value-at-Risk and the PH-transform risk measure, whereas the quantile risk measure is
not a concave risk measure.


4 Risk-adjusted premiums 


In equation (8) the expected total premium income is set equal to the expected aggre- 
gate claims payments: in order to refer to a class of premium principles that is more 
general than the pure premium principle, we consider a new expected value condition 
with reference to the class of distortion risk measures. 

We impose that the distorted expected value of the total premium income $\delta(P)$
equals the distorted expected value of the aggregate claims S, given two distortion
functions $g_1$ and $g_2$. Note that in our proposal it is possible to consider distortion
functions that are not necessarily the same.

The equilibrium condition may be studied as an equation on the initial premium
P: if it admits a solution which is unique, then we call initial risk-adjusted premium
the corresponding premium P. This is formalised in the following definition.

Definition 1. Let $g_1$ and $g_2$ be distortion functions. The initial risk-adjusted premium
P is the unique initial premium, if it exists, for which the distorted expected value
of the total premium income $\delta(P)$ equals the distorted expected value of the aggregate
claims S, that is

$$W_{g_1}(\delta(P)) = W_{g_2}(S). \qquad (14)$$


Equation (14) may be studied from several different perspectives, mostly con- 
cerned with the existence and uniqueness of the solutions. The next result presents a 
set of conditions ensuring a positive answer to both these questions: the choice of an 
excess of loss reinsurance for the layer m xs d with no aggregate deductible D and 
K reinstatements plays the leading role. 


Proposition 1. Given an XL reinsurance with K reinstatements and no aggregate
deductible and given two distortion functions $g_1$ and $g_2$, the initial risk-adjusted
premium P is a function of the percentages of reinstatement $c_1, c_2, \ldots, c_K$.
Moreover, it satisfies the following properties:


i) P is a decreasing function of each percentage of reinstatement $c_i$ (i = 1, ..., K);
ii) P is a convex, supermodular, quasiconcave and quasiconvex function of the
percentages of reinstatement $c_1, c_2, \ldots, c_K$.


Proof. Given the equilibrium condition between the distorted expected premium
income and the distorted expected claim payments (14), the initial risk-adjusted
premium P is well defined: in fact equation (14) admits a solution which is unique.

Since the layers $L_X(im, (i + 1)m)$, i = 0, 1, ..., K, are comonotonic risks,
from (7) we find

$$W_{g_2}(S) = \sum_{i=0}^{K} W_{g_2}\big(L_X(im, (i + 1)m)\big). \qquad (15)$$

From (5), by assuming the absence of an aggregate deductible (i.e., D = 0), we have

$$W_{g_1}(\delta(P)) = P \left( 1 + \frac{1}{m} \sum_{i=0}^{K-1} c_{i+1} W_{g_1}\big(L_X(im, (i + 1)m)\big) \right). \qquad (16)$$

Therefore, the initial premium P is well defined and it is given by

$$P = \frac{\sum_{i=0}^{K} W_{g_2}\big(L_X(im, (i + 1)m)\big)}{1 + \frac{1}{m} \sum_{i=0}^{K-1} c_{i+1} W_{g_1}\big(L_X(im, (i + 1)m)\big)}. \qquad (17)$$

The initial risk-adjusted premium P may be considered a function of the
percentages of reinstatement $c_1, c_2, \ldots, c_K$. Let $P = f(c_1, c_2, \ldots, c_K)$.

Clearly the function f is a decreasing function of any percentage of reinstatement
$c_i$ (where i = 1, ..., K).


Moreover, if we set

$$A = \sum_{i=0}^{K} W_{g_2}\big(L_X(im, (i + 1)m)\big),$$

the gradient vector $\nabla f(c)$ is

$$\nabla f(c) = \left( \frac{\partial f}{\partial c_l}(c) \right) = \left( \frac{-A\, W_{g_1}\big(L_X((l - 1)m, lm)\big)}{m \left[ 1 + \frac{1}{m} \sum_{i=0}^{K-1} c_{i+1} W_{g_1}\big(L_X(im, (i + 1)m)\big) \right]^2} \right)$$

for each l = 1, ..., K.

Convexity follows by the strict positivity and concavity of the function

$$1 + \frac{1}{m} \sum_{i=0}^{K-1} c_{i+1} W_{g_1}\big(L_X(im, (i + 1)m)\big).$$

Moreover, the Hessian matrix $H_f(c)$ of the function f is given by

$$H_f(c) = \left( \frac{\partial^2 f}{\partial c_l \partial c_n}(c) \right) = \left( \frac{2A\, W_{g_1}\big(L_X((l - 1)m, lm)\big)\, W_{g_1}\big(L_X((n - 1)m, nm)\big)}{m^2 \left[ 1 + \frac{1}{m} \sum_{i=0}^{K-1} c_{i+1} W_{g_1}\big(L_X(im, (i + 1)m)\big) \right]^3} \right)$$

for each l, n = 1, ..., K. More compactly it can be expressed as

$$H_f(c) = \Big( W_{g_1}\big(L_X((l - 1)m, lm)\big)\, W_{g_1}\big(L_X((n - 1)m, nm)\big) \Big)\, B$$

for each l, n = 1, ..., K, where

$$B = \frac{2A}{m^2 \left[ 1 + \frac{1}{m} \sum_{i=0}^{K-1} c_{i+1} W_{g_1}\big(L_X(im, (i + 1)m)\big) \right]^3}.$$

Clearly, $H_f(c)$ is non-negative definite.

Given that any cross-partial derivative of f, that is, any off-diagonal entry of $H_f(c)$,
is non-negative, the function f is supermodular.

Finally, the initial risk-adjusted premium P is a quasiconcave and quasiconvex
function of the percentages of reinstatement $c_1, c_2, \ldots, c_K$ because it is a ratio of
affine functions. □
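
Formula (17) and the first two properties of Proposition 1 can be illustrated numerically. In the sketch below the layer risk measures are supplied directly as hypothetical numbers (they are not derived from any claim distribution), and the function name is illustrative.

```python
def initial_premium(c, w1, w2, m):
    """Formula (17) with no aggregate deductible (D = 0).

    c[i]  holds c_{i+1},                    i = 0, ..., K-1
    w1[i] holds W_{g1}(L_X(im, (i+1)m)),    i = 0, ..., K-1
    w2[i] holds W_{g2}(L_X(im, (i+1)m)),    i = 0, ..., K
    """
    numerator = sum(w2)
    denominator = 1.0 + sum(ci * wi for ci, wi in zip(c, w1)) / m
    return numerator / denominator


# Hypothetical layer risk measures for K = 2 reinstatements and m = 10.
w1, w2, m = [5.0, 2.5], [4.0, 2.0, 1.0], 10.0
free = initial_premium([0.0, 0.0], w1, w2, m)      # free reinstatements
halfway = initial_premium([0.5, 0.5], w1, w2, m)   # reinstatements paid at 50%
paid = initial_premium([1.0, 1.0], w1, w2, m)      # reinstatements paid at 100%
```

Here the premium falls from 7 with free reinstatements to 4 with fully paid ones, and the value at the midpoint lies below the average of the two endpoints, as monotonicity and convexity require.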


Remark 1. Note that the regularity properties exhibited by the initial risk-adjusted
premium P are not influenced by functional relations between the two distortion
functions $g_1$ and $g_2$. Moreover, any hypothesis on concavity/convexity of distortion
risk measures may be omitted because they are unnecessary to prove the smooth shape
of the initial premium P as a function of $c_1, c_2, \ldots, c_K$.


Remark 2. The reinsurance companies often assess treaties under the assumption that
there are only total losses. This happens, for example, when they use the rate on line
method to price catastrophe reinsurance. Then it follows that the aggregate claims are
generated by a discrete distribution and we have (for more details see Campana [1])

$$P = f(c_1, c_2, \ldots, c_K) = \frac{m \sum_{i=0}^{K} g_2(p_{i+1})}{1 + \sum_{i=0}^{K-1} c_{i+1}\, g_1(p_{i+1})} \qquad (18)$$

where the premium for the ith reinstatement (4) is a two-point random variable
distributed as $c_i P B_{p_i}$ and $B_{p_i}$ denotes a Bernoulli random variable such that

$$\Pr[B_{p_i} = 1] = p_i = 1 - \Pr[B_{p_i} = 0].$$

Then

$$\nabla f(c) = \left( \frac{\partial f}{\partial c_l}(c) \right) = \left( \frac{-m \sum_{i=0}^{K} g_2(p_{i+1})\, g_1(p_l)}{\left[ 1 + \sum_{i=0}^{K-1} c_{i+1}\, g_1(p_{i+1}) \right]^2} \right)$$

and

$$H_f(c) = \left( \frac{\partial^2 f}{\partial c_l \partial c_n}(c) \right) = \left( \frac{2m \sum_{i=0}^{K} g_2(p_{i+1})\, g_1(p_l)\, g_1(p_n)}{\left[ 1 + \sum_{i=0}^{K-1} c_{i+1}\, g_1(p_{i+1}) \right]^3} \right)$$

for each l, n = 1, ..., K.
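
The total-loss formula (18) and its gradient can be sketched as follows. The gradient is the one obtained by differentiating (18) directly; the probabilities, the square-root (PH-transform with ρ = 2) distortion and the function names are hypothetical choices made for illustration.

```python
def premium_total_losses(c, p, g1, g2, m):
    """Formula (18): c[i] holds c_{i+1} (length K); p[i] holds p_{i+1} (length K + 1)."""
    numerator = m * sum(g2(pi) for pi in p)
    denominator = 1.0 + sum(ci * g1(pi) for ci, pi in zip(c, p))
    return numerator / denominator


def premium_gradient(c, p, g1, g2, m):
    """Partial derivatives of (18) with respect to c_1, ..., c_K."""
    numerator = m * sum(g2(pi) for pi in p)
    denominator = 1.0 + sum(ci * g1(pi) for ci, pi in zip(c, p))
    return [-numerator * g1(p[l]) / denominator ** 2 for l in range(len(c))]


def sqrt_distortion(u):
    """PH-transform distortion with rho = 2, used for both g_1 and g_2 here."""
    return u ** 0.5


# Hypothetical probabilities p_1, p_2, p_3 for K = 2 reinstatements and m = 10.
p, m = [0.25, 0.16, 0.09], 10.0
premium = premium_total_losses([1.0, 1.0], p, sqrt_distortion, sqrt_distortion, m)
gradient = premium_gradient([1.0, 1.0], p, sqrt_distortion, sqrt_distortion, m)
```

All gradient entries are negative, in line with property i) of Proposition 1, and their magnitudes are proportional to $g_1(p_l)$, so the premium is most sensitive to the percentage attached to the most likely reinstatement.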


5 Conclusions 


In actuarial literature excess of loss reinsurance with reinstatement has been essen- 
tially studied in the framework of collective model of risk theory for which the classical 
evaluation of pure premiums requires knowledge of the claim size distribution. Gener- 
ally, in practice, there is incomplete information: few characteristics of the aggregate 
claims can be computed. In this situation, interest in general properties characterising 
the involved premiums is flourishing. 

Setting this problem in the framework of risk-adjusted premiums, it is shown 
that if risk-adjusted premiums satisfy a generalised expected value equation, then the 
initial premium exhibits some regularity properties as a function of the percentages 
of reinstatement. In this way it is possible to relax the particular choice made by 
Walhin and Paris [6] of the PH-transform risk measure and to extend the analysis of 
excess of loss reinsurance with reinstatements to cover the case of not necessarily 
equal reinstatements. 

The obtained results suggest that further research may be addressed to the analysis 
of optimal premium plans. 


Acknowledgement. We are grateful to some MAF2008 conference members for valuable
comments and suggestions on an earlier version of the paper.


Some classes of multivariate risk measures


Marta Cardin and Elisa Pagani 


Abstract. In actuarial literature the properties of risk measures or insurance premium prin- 
ciples have been extensively studied. We propose a new kind of stop-loss transform and a 
related order in the multivariate setting and some equivalent conditions. In our work there is 
a characterisation of some particular classes of multivariate and bivariate risk measures and a 
new representation result in a multivariate framework. 


Key words: risk measures, distortion function, concordance measures, stochastic orders 


1 Introduction 


In actuarial sciences it is fairly common to compare two random variables that are 
risks by stochastic orderings defined using inequalities on expectations of the random 
variables transformed by measurable functions. By characterising the considered set 
of functions some particular stochastic orderings may be obtained such as stochastic 
dominance or stop-loss order. These stochastic order relations of integral form may 
be extended to cover also the case of random vectors. 

The main contribution of this paper concerns the construction of a mathematical 
framework for the representation of some classes of multivariate risk measures; in 
particular we study the extension to the multivariate case of distorted risk measures 
and we propose a new kind of vector risk measure. Moreover, we introduce the product
stop-loss transform of a random vector to derive a multivariate product stop-loss order.


2 Multivariate case 


We consider only non-negative random vectors. Let $\Omega$ be the space of the states
of nature, $\mathcal{F}$ be the $\sigma$-field and $P$ be the probability measure on $\mathcal{F}$. Our random
vector is the function $X : \Omega \to \mathbb{R}^n$ such that $X(\omega)$ represents the payoff obtained
if state $\omega$ occurs. We also specify some notations: $F^X(x) : \mathbb{R}^n \to [0, 1]$ is the
distribution function of $X$, $S^X(x) : \mathbb{R}^n \to [0, 1]$ is its survival or tail function, and
$(X(\omega) - a)_+ = \max(X(\omega) - a, 0)$ componentwise.


M. Corazza et al. (eds.), Mathematical and Statistical Methods for Actuarial Sciences and Finance 
© Springer-Verlag Italia 2010 




A risk measure, or a premium principle, is the functional $R : \mathcal{X} \to \bar{\mathbb{R}}$, where $\mathcal{X}$
is a set of non-negative random vectors and $\bar{\mathbb{R}}$ is the extended real line.


In what follows we present some desirable properties P1-P8 for risk measures, which are
our proposal to generalise the well-known properties of the scalar case:


1. Expectation boundedness: $R[X] \ge E[X_1 \cdots X_n]$ $\forall X$.

2. Non-excessive loading: $R[X] \le \sup_{\omega \in \Omega} \{|X_1(\omega)|, \ldots, |X_n(\omega)|\}$.

3. Translation invariance: $R[X + a] = R[X] + \bar{a}$ $\forall X$, $\forall a \in \mathbb{R}^n$, where $a$ is a
vector of sure initial amounts and $\bar{a}$ is the componentwise product of the elements
of $a$.

4. Positive homogeneity of order $n$: $R[cX] = c^n R[X]$ $\forall X$, $\forall c \ge 0$.

5. Monotonicity: $R[X] \le R[Y]$ $\forall X, Y$ such that $X \preceq Y$ in some stochastic sense.

6. Constancy: $R[b] = \bar{b}$ $\forall b \in \mathbb{R}^n$. A special case is $R[0] = 0$, which is called
the normalisation property.

7. Subadditivity: $R[X + Y] \le R[X] + R[Y]$ $\forall X, Y$, which reflects the idea that
risk can be reduced by diversification.

8. Convexity: $R[\lambda X + (1 - \lambda) Y] \le \lambda R[X] + (1 - \lambda) R[Y]$, $\forall X, Y$ and $\lambda \in [0, 1]$;
this property implies diversification effects as subadditivity does.


We recall here also some notations about stochastic orderings for multivariate random
variables: $X \preceq_{SD} Y$ indicates the usual stochastic dominance, $X \preceq_{UO} Y$ indicates the
upper orthant order, $X \preceq_{LO} Y$ indicates the lower orthant order, $X \preceq_{C} Y$ indicates
the concordance order and $X \preceq_{SM} Y$ indicates the supermodular order. For the
definitions see, for instance, [5].

Let us now characterise another formulation for stop-loss transform in the multi- 
variate setting. 


Definition 1. The product stop-loss transform of a random vector $X \in \mathcal{X}$ is defined
by $\pi_X(t) = E[(X_1 - t_1)_+ \cdots (X_n - t_n)_+]$ $\forall t \in \mathbb{R}^n$.


As in the univariate case, we can use this instrument to derive a multivariate stochastic 
order: 


Definition 2. Let $X, Y \in \mathcal{X}$ be two random vectors. We say that $X$ precedes $Y$ in the
multivariate product stop-loss order ($X \preceq_{SL} Y$) if it holds:


$$\pi_X(t) \le \pi_Y(t) \quad \forall t \in \mathbb{R}^n.$$
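
Definitions 1 and 2 can be illustrated on data by replacing the expectation with a sample mean. The sketch below is a minimal empirical version, not a method from the paper: the function name `product_stop_loss` and the two toy samples are hypothetical. Both samples have the same marginals, one with comonotone and one with countermonotone components.

```python
from typing import Sequence, Tuple


def product_stop_loss(sample: Sequence[Tuple[float, ...]], t: Sequence[float]) -> float:
    """Sample mean of (x_1 - t_1)_+ * ... * (x_n - t_n)_+ over the observations."""
    total = 0.0
    for x in sample:
        product = 1.0
        for x_i, t_i in zip(x, t):
            product *= max(x_i - t_i, 0.0)  # positive part, componentwise
        total += product
    return total / len(sample)


# Two toy bivariate samples with identical marginals {1, 2, 3, 4}.
comonotone = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
countermonotone = [(1.0, 4.0), (2.0, 3.0), (3.0, 2.0), (4.0, 1.0)]
```

On these samples the transform of the countermonotone vector never exceeds that of the comonotone one at any threshold, which is the ordering described in Definition 2.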


It could be interesting to give some extensions to the theory of risk in the multivariate
case, but sometimes it is not possible and we will be satisfied if the generalisation
works at least in two dimensions. As is well known, different notions are equivalent
in the bivariate case for risks with the same univariate marginal distribution [11], but
this is no longer true for $n$-variate risks with $n \ge 3$ [8].

We now introduce the concept of Fréchet space: $\mathcal{R}$ denotes the Fréchet space
given the margins, that is, $\mathcal{R}(F_1, F_2)$ is the class of all bivariate distributions with
given margins $F_1$, $F_2$. The lower Fréchet bound $\underline{X}$ of $X$ is defined by
$F^{\underline{X}}(t) := \max\{F_1(t_1) + F_2(t_2) - 1, 0\}$ and the upper Fréchet bound of $X$, $\overline{X}$, is defined by
$F^{\overline{X}}(t) := \min_i \{F_i(t_i)\}$, where $t = (t_1, t_2) \in \mathbb{R}^2$ and $i = 1, 2$. The following theorems
sum up some known results about stochastic orders. The interested reader is referred
to [5].


Theorem 1. Let $X, Y$ be bivariate random variables, where $X, Y \in \mathcal{R}(F_1, F_2)$. Then:
$X \preceq_{UO} Y \iff Y \preceq_{LO} X \iff X \preceq_{SM} Y \iff X \preceq_{C} Y$.


This result is no longer true when multivariate random variables are considered with
$n \ge 3$.


Theorem 2. Let $X, Y$ be bivariate random variables in $\mathcal{R}(F_1, F_2)$. The following
conditions are equivalent:


i) $X \preceq_{SM} Y$;
ii) $E[f(X)] \le E[f(Y)]$ for every increasing supermodular function $f$;
iii) $E[f_1(X_1) f_2(X_2)] \le E[f_1(Y_1) f_2(Y_2)]$ for all increasing functions $f_1, f_2$;
iv) $\pi_X(t) \le \pi_Y(t)$ $\forall t \in \mathbb{R}^2$.


3 Multivariate distorted risk measures 


Distorted probabilities have been developed in the theory of risk to consider the 
hypothesis that the original probability is not adequate to describe the distribution 
(for example to protect us against some events). These probabilities generate new 
risk measures, called distorted risk measures, see for instance [4, 12, 13]. 

In this section we try to deepen our knowledge about distorted risk measures in
the multidimensional case. This topic is partly discussed in [9], but there the
representation is not supported by complete mathematical results.

We can define the distortion risk measure in the multivariate case as: 


Definition 3. Given a distortion $g$, which is a non-decreasing function such that
$g : [0, 1] \to [0, 1]$, with $g(0) = 0$ and $g(1) = 1$, a vector distorted risk measure is the
functional: $R_g[X] = \int_0^{+\infty} \cdots \int_0^{+\infty} g(S^X(x))\, dx_1 \cdots dx_n$.


We note that the function $g(S^X(x)) : \mathbb{R}^n_+ \to [0, 1]$ is non-increasing in each
component.


Proposition 1. The properties of the multivariate distorted risk measures are the
following: P1-P6, and P7, P8 if $g$ is concave.


Proof. P1 and P2 follow immediately from Definition 3, P3 follows recalling that
$S^{X+a}(t) = S^X(t - a)$, P4 is a consequence of the fact that $S^{cX}(t) = S^X(t/c)$, P5
follows from the relationship between multivariate stochastic orders and P6 is given
by $\int_0^{b_n} \cdots \int_0^{b_1} g(1)\, dt_1 \cdots dt_n = b_n \cdots b_1 = \bar{b}$. P7 follows from this definition of
concavity: if $g$ is a concave function, then we have that $g(a + c) - g(a) \ge g(b + c) - g(b)$
with $a \le b$ and $c \ge 0$. We apply this definition pointwise to $S^X \le S^Y$ with
$S^{X+Y} \ge 0$. P8 is obvious from properties P4, P5 and P7.


In the multivariate case the equality $F^X = 1 - S^X$ does not hold and thus the
relation $\int_{\mathbb{R}^n_+} g(S^X(x))\, dx = \int_{\mathbb{R}^n_+} [1 - f(F^X(x))]\, dx$ is not in general true with
$f : [0, 1] \to [0, 1]$ an increasing function.

Moreover, the duality relationship between the functions f and g does not hold, 
thus, in general, the equation g (x) = 1 — f (1 — x) is not true. Applying the concept 
of distortion of either the survival function or the distribution function, the relationship 
between f and g no longer holds. 

Therefore we can observe the differences in the two different approaches. 


Definition 4. Given a distortion function $f : [0, 1] \to [0, 1]$, increasing and such
that $f(0) = 0$ and $f(1) = 1$, a vector distorted risk measure is the functional:


$$R_f[X] = \int_{\mathbb{R}^n_+} [1 - f(F^X(x))]\, dx.$$


Now we have subadditivity with a convex function $f$ and this leads to the convexity
of the measure $R_f$.

Remembering that a distortion is a univariate function even when we deal with 
random vectors and multivariate distributions, we can also define vector Values at 
Risk (VaR) and vector Conditional Values at Risk (CVaR), using slight alterations of 
the usual distortions for VaR and CVaR respectively, and composing these with the 
multivariate tail distributions or the distribution functions. 


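Definition 5. Let $X$ be a random vector that takes on values in $\mathbb{R}^n_+$. Vector VaR is the
distorted measure $\mathrm{VaR}[X; p] = \int_0^{+\infty} \cdots \int_0^{+\infty} g(S^X(x))\, dx_1 \cdots dx_n$, expressed
using the distortion

$$g(S^X(x)) = \begin{cases} 0 & 0 \le S^{X_i}(x_i) \le 1 - p_i \\ 1 & 1 - p_i < S^{X_i}(x_i) \le 1. \end{cases}$$


If we want to give this formulation a more explicit form we can consider the
componentwise order for which $x \ge \mathrm{VaR}[X; p]$ stands for $x_i \ge \mathrm{VaR}[X_i; p_i]$ $\forall i = 1, \ldots, n$,
or more lightly $x_i \ge \mathrm{VaR}_{X_i}$, and we can rewrite the distortion as:

$$g(S^X(x)) = \begin{cases} 0 & x_i \ge \mathrm{VaR}_{X_i} \\ 1 & 0 \le x_i < \mathrm{VaR}_{X_i} \end{cases}$$

to obtain $\mathrm{VaR}[X; p] = \int_0^{\mathrm{VaR}_{X_n}} \cdots \int_0^{\mathrm{VaR}_{X_1}} 1\, dx_1 \cdots dx_n = \mathrm{VaR}_{X_1} \cdots \mathrm{VaR}_{X_n}$.
Obviously this result suggests that considering a componentwise order is similar to
considering independence between the components of the random vector. Actually
we are considering only the case in which the components are concordant.

In the same way we can define the vector Conditional Value at Risk:


Definition 6. Let $X$ be a random vector with values in $\mathbb{R}^n_+$. Vector CVaR is the distorted
measure $\mathrm{CVaR}[X; p] = \int_0^{+\infty} \cdots \int_0^{+\infty} g(S^X(x))\, dx_1 \cdots dx_n$, expressed using the
distortion:

$$g(S^X(x)) = \begin{cases} \dfrac{S^X(x)}{\prod_{i=1}^n (1 - p_i)} & 0 \le S^{X_i}(x_i) \le 1 - p_i \\ 1 & 1 - p_i < S^{X_i}(x_i) \le 1. \end{cases}$$


A more tractable form is given by:

$$g(S^X(x)) = \begin{cases} \dfrac{S^X(x)}{\prod_{i=1}^n (1 - p_i)} & x_i \ge \mathrm{VaR}_{X_i} \\ 1 & 0 \le x_i < \mathrm{VaR}_{X_i} \end{cases}$$

which allows this formula:

$$\mathrm{CVaR}[X; p] = \int_0^{\mathrm{VaR}_{X_n}} \cdots \int_0^{\mathrm{VaR}_{X_1}} 1\, dx_1 \cdots dx_n + \int_{\mathrm{VaR}_{X_n}}^{+\infty} \cdots \int_{\mathrm{VaR}_{X_1}}^{+\infty} \frac{S^X(x)}{\prod_{i=1}^n (1 - p_i)}\, dx_1 \cdots dx_n$$

$$= \mathrm{VaR}[X; p] + \int_{\mathrm{VaR}_{X_n}}^{+\infty} \cdots \int_{\mathrm{VaR}_{X_1}}^{+\infty} \frac{S^X(x)}{\prod_{i=1}^n (1 - p_i)}\, dx_1 \cdots dx_n.$$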

The second part of the formula is not easy to render explicitly if we do not introduce 
an independence hypothesis. 

If we follow Definition 4 instead of 3 we can introduce a different formulation for 
CVaR, very useful in proving a good result proposed later on. 

The increasing convex function $f$ used in the definition of CVaR is the following:

$$f(F^X(x)) = \begin{cases} 0 & F^{X_i}(x_i) < p_i, \quad 0 \le x_i < \mathrm{VaR}_{X_i} \\ \dfrac{F^X(x) - 1 + \prod_{i=1}^n (1 - p_i)}{\prod_{i=1}^n (1 - p_i)} & F^{X_i}(x_i) \ge p_i, \quad x_i \ge \mathrm{VaR}_{X_i}. \end{cases}$$

Definition 7. Let $X$ be a random vector that takes on values in $\mathbb{R}^n_+$ and $f$ be an
increasing function $f : [0, 1] \to [0, 1]$, such that $f(0) = 0$ and $f(1) = 1$ and defined
as above. The Conditional Value at Risk distorted by such an $f$ is the following:

$$\mathrm{CVaR}[X; p] = \int_0^{\mathrm{VaR}_{X_n}} \cdots \int_0^{\mathrm{VaR}_{X_1}} 1\, dx_1 \cdots dx_n + \int_{\mathrm{VaR}_{X_n}}^{+\infty} \cdots \int_{\mathrm{VaR}_{X_1}}^{+\infty} \left[1 - \frac{F^X(x) - 1 + \prod_{i=1}^n (1 - p_i)}{\prod_{i=1}^n (1 - p_i)}\right] dx_1 \cdots dx_n$$

$$= \int_0^{+\infty} \cdots \int_0^{+\infty} \left[1 - \frac{\left(F^X(x) - 1 + \prod_{i=1}^n (1 - p_i)\right)_+}{\prod_{i=1}^n (1 - p_i)}\right] dx_1 \cdots dx_n.$$

We recall here that if $0 \le x_i < \mathrm{VaR}_{X_i}$, or $F^{X_i} < p_i$, then $F^X \le \min_i \{p_i\}$, while
if $x_i \ge \mathrm{VaR}_{X_i}$ or $F^{X_i} \ge p_i$ then $F^X \ge \max\{\sum_{i=1}^n p_i - (n - 1), 0\}$. Only in
the bivariate case do we then know that $S^X = F^X - 1 + S^{X_1} + S^{X_2}$. Therefore if
$F^X \le \min_i \{p_i\}$, also $F^X \le 1 - \prod_{i=1}^n (1 - p_i)$. This lets us consider the bounds for
the joint distribution, not just for the marginals. Finally we can present an interesting
result regarding the representation of subadditive distorted risk measures through
convex combinations of Conditional Values at Risk.


Theorem 3. Let $X \in \mathcal{X}$. Consider a subadditive multivariate distortion in the form
$R_f[X] = \int_{\mathbb{R}^n_+} [1 - f(F^X(x))]\, dx$. Then there exists a probability measure $\mu$ on
$[0, 1]$ such that: $R_f[X] = \int_0^1 \mathrm{CVaR}[X; p]\, d\mu(p)$.


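Proof. The multivariate distorted measure $R_f[X] = \int_{\mathbb{R}^n_+} [1 - f(F^X(x))]\, dx$ is
subadditive if $f$ is a convex, increasing function such that $f : [0, 1] \to [0, 1]$ with
$f(0) = 0$ and $f(1) = 1$. Let $p = 1 - \prod_{i=1}^n (1 - p_i)$, then a probability measure
$\mu(p)$ exists such that this function $f$ can be represented as $f(u) = \int_0^1 \frac{(u - p)_+}{1 - p}\, d\mu(p)$
with $p \in [0, 1]$. Then, $\forall X \in \mathcal{X}$, we can write

$$R_f[X] = \int_{\mathbb{R}^n_+} [1 - f(F^X(x))]\, dx$$

$$= \int_{\mathbb{R}^n_+} \left[1 - \int_0^1 \frac{\left(F^X(x) - 1 + \prod_{i=1}^n (1 - p_i)\right)_+}{\prod_{i=1}^n (1 - p_i)}\, d\mu(p)\right] dx$$

$$= \int_{\mathbb{R}^n_+} \int_0^1 \left[1 - \frac{\left(F^X(x) - 1 + \prod_{i=1}^n (1 - p_i)\right)_+}{\prod_{i=1}^n (1 - p_i)}\right] d\mu(p)\, dx$$

$$= \int_0^1 d\mu(p) \int_{\mathbb{R}^n_+} \left[1 - \frac{\left(F^X(x) - 1 + \prod_{i=1}^n (1 - p_i)\right)_+}{\prod_{i=1}^n (1 - p_i)}\right] dx$$

$$= \int_0^1 \mathrm{CVaR}[X; p]\, d\mu(p). \qquad \square$$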


Since not every result about stochastic dominance works in a multivariate setting, we 
restrict our attention to the bivariate one. However, this is interesting because it takes 
into consideration the riskiness not only of the marginal distributions, but also of the 
joint distribution, tracing out a course of action to multivariate generalisations. It is 
worth noting that this procedure has something to do with concordance measures (or 
measures of dependence), which we will describe later on. 

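We propose some observations about VaR and CVaR formulated through distortion
functions when $X$ is a random vector with values in $\mathbb{R}^2_+$; we have:

$$\mathrm{VaR}[X; p] = \mathrm{VaR}_{X_1} \mathrm{VaR}_{X_2}$$

and

$$\mathrm{CVaR}[X; p] = \mathrm{VaR}_{X_1} \mathrm{VaR}_{X_2} + \int_{\mathrm{VaR}_{X_2}}^{+\infty} \int_{\mathrm{VaR}_{X_1}}^{+\infty} \frac{S^X(x_1, x_2)}{(1 - p_1)(1 - p_2)}\, dx_1\, dx_2.$$

Under the independence hypothesis we can rewrite CVaR in this manner:

$$\mathrm{CVaR}[X; p] = \mathrm{VaR}_{X_1} \mathrm{VaR}_{X_2} + \frac{1}{(1 - p_1)(1 - p_2)} \int_{\mathrm{VaR}_{X_2}}^{+\infty} \int_{\mathrm{VaR}_{X_1}}^{+\infty} S^{X_1}(x_1)\, S^{X_2}(x_2)\, dx_1\, dx_2.$$

Then we consider

$$\int_{\mathrm{VaR}_{X_2}}^{+\infty} \int_{\mathrm{VaR}_{X_1}}^{+\infty} S^{X_1}(x_1)\, S^{X_2}(x_2)\, dx_1\, dx_2 = (1 - p_1)(1 - p_2) \mathrm{VaR}_{X_1} \mathrm{VaR}_{X_2} + (1 - p_1) \mathrm{VaR}_{X_1} \int_{\mathrm{VaR}_{X_2}}^{+\infty} x_2\, dS^{X_2}(x_2)$$

$$+ (1 - p_2) \mathrm{VaR}_{X_2} \int_{\mathrm{VaR}_{X_1}}^{+\infty} x_1\, dS^{X_1}(x_1) + \int_{\mathrm{VaR}_{X_1}}^{+\infty} x_1\, dS^{X_1}(x_1) \int_{\mathrm{VaR}_{X_2}}^{+\infty} x_2\, dS^{X_2}(x_2),$$

which leads, with the first part, to:

$$\mathrm{CVaR}[X; p] = 2\, \mathrm{VaR}_{X_1} \mathrm{VaR}_{X_2} - \mathrm{VaR}_{X_1} E[X_2 \mid X_2 > \mathrm{VaR}_{X_2}] - \mathrm{VaR}_{X_2} E[X_1 \mid X_1 > \mathrm{VaR}_{X_1}] + E[X_2 \mid X_2 > \mathrm{VaR}_{X_2}]\, E[X_1 \mid X_1 > \mathrm{VaR}_{X_1}].$$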
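
As a numerical check of the closed form under independence, the sketch below works through one concrete case. It rests on an assumption that is not in the paper: both components are exponentially distributed with rates $\lambda_1, \lambda_2$, so that $\mathrm{VaR}_{X_i} = -\ln(1 - p_i)/\lambda_i$ and $E[X_i \mid X_i > \mathrm{VaR}_{X_i}] = \mathrm{VaR}_{X_i} + 1/\lambda_i$. All function names are hypothetical. The closed form is compared with a direct midpoint-rule evaluation of the integral of the product of the survival functions.

```python
import math


def exp_var(rate: float, p: float) -> float:
    """p-quantile of an exponential distribution with the given rate."""
    return -math.log(1.0 - p) / rate


def exp_tail_mean(rate: float, p: float) -> float:
    """E[X | X > VaR] for an exponential distribution (memoryless property)."""
    return exp_var(rate, p) + 1.0 / rate


def vector_cvar_independent(v1: float, v2: float, e1: float, e2: float) -> float:
    """Bivariate CVaR under independence, from marginal VaRs and tail means."""
    return 2.0 * v1 * v2 - v1 * e2 - v2 * e1 + e1 * e2


def survival_tail_integral(rate: float, lower: float, steps: int = 20_000) -> float:
    """Midpoint-rule integral of exp(-rate * x) from `lower` to a far upper limit."""
    upper = lower + 60.0 / rate  # the integrand is negligible beyond this point
    h = (upper - lower) / steps
    return h * sum(math.exp(-rate * (lower + (k + 0.5) * h)) for k in range(steps))


def vector_cvar_by_integration(rate1: float, rate2: float, p1: float, p2: float) -> float:
    """VaR product plus the scaled double integral of the survival functions."""
    v1, v2 = exp_var(rate1, p1), exp_var(rate2, p2)
    double_integral = survival_tail_integral(rate1, v1) * survival_tail_integral(rate2, v2)
    return v1 * v2 + double_integral / ((1.0 - p1) * (1.0 - p2))
```

In this exponential case the closed form reduces to $\mathrm{VaR}_{X_1} \mathrm{VaR}_{X_2} + 1/(\lambda_1 \lambda_2)$, because each tail mean exceeds the corresponding VaR by exactly $1/\lambda_i$.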


4 Measures of concordance 


Concordance between two random variables arises if large values of one tend to occur
with large values of the other and small values occur with small values of the other. So
concordance considers nonlinear associations between random variables that correlation
might miss. Now, we want to consider the main characteristics a measure of
concordance should have. We restrict our attention to the bivariate case.

In 1984 Scarsini [10] defined a set of axioms that a bivariate dependence ordering
of distributions should have in order that higher ordering means more positive
concordance.

By a measure of concordance we mean a function that attaches to every continuous
bivariate random vector a real number $\alpha(X_1, X_2)$ satisfying the following properties:


1. $-1 \le \alpha(X_1, X_2) \le 1$;

2. $\alpha(X_1, X_1) = 1$;

3. $\alpha(X_1, -X_1) = -1$;

4. $\alpha(-X_1, X_2) = \alpha(X_1, -X_2) = -\alpha(X_1, X_2)$;

5. $\alpha(X_1, X_2) = \alpha(X_2, X_1)$;

6. if $X_1$ and $X_2$ are independent, then $\alpha(X_1, X_2) = 0$;

7. if $(X_1, X_2) \preceq_{C} (Y_1, Y_2)$ then $\alpha(X_1, X_2) \le \alpha(Y_1, Y_2)$;

8. if $\{X_n\}$ is a sequence of bivariate random vectors converging in distribution to $X$,
then $\lim_{n \to \infty} \alpha(X_n) = \alpha(X)$.


Now we consider the dihedral group $D_4$ of the symmetries on the square $[0, 1]^2$. We
have $D_4 = \{e, r, r^2, r^3, h, hr, hr^2, hr^3\}$ where $e$ is the identity, $h$ is the reflection
about $x = \frac{1}{2}$ and $r$ is a 90° counterclockwise rotation.

A measure $\mu$ on $[0, 1]^2$ is said to be $D_4$-invariant if its value for any Borel set
$A$ of $[0, 1]^2$ is invariant with respect to the symmetries of the unit square, that is
$\mu(A) = \mu(d(A))$ for every $d \in D_4$.


Proposition 2. If $\mu$ is a bounded $D_4$-invariant measure on $[0, 1]^2$, there exist $\alpha, \beta \in \mathbb{R}$
such that the function defined by


$$\rho(X_1, X_2) = \alpha \int_{[0,1]^2} F_{(X_1, X_2)}(x_1, x_2)\, d\mu\big(F_{X_1}(x_1), F_{X_2}(x_2)\big) - \beta$$


is a concordance measure.


Proof. A measure of concordance associated to a continuous bivariate random vector
depends only on the copula associated to the vector, since a measure of concordance
is invariant under increasing transformations of the random variables. So the
result follows from Theorem 3.1 of [6]. □
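
Some of the axioms listed at the start of this section can be checked directly on data. The sketch below uses the sample version of Kendall's tau, a standard example of a measure of concordance, which is our choice of illustration and is not discussed in the paper. The function name `kendall_tau` and the toy observations are hypothetical, and the sample is assumed to contain no ties. Only the normalisation, sign-change and symmetry axioms are verified here; the axioms on independence, concordance ordering and convergence in distribution are population properties that a single sample cannot confirm.

```python
from typing import Sequence


def kendall_tau(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Sample Kendall's tau: (concordant pairs - discordant pairs) / number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least two points")
    n = len(xs)
    balance = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            balance += (s > 0) - (s < 0)  # +1 if concordant, -1 if discordant
    return balance / (n * (n - 1) / 2)


# Hypothetical tie-free observations of a bivariate vector (X1, X2).
x1 = [0.3, 1.2, 2.5, 0.9, 1.7]
x2 = [1.1, 0.4, 2.0, 0.8, 1.9]
neg_x1 = [-v for v in x1]
neg_x2 = [-v for v in x2]
```

The quadratic pair loop is adequate for small samples; larger data sets would call for a sorting-based algorithm.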


5 A vector-valued measure 


In Definition 1 we have introduced the concept of product stop-loss transform for 
random vectors. We use this approach to give a definition for a new measure that we 
call Product Stop-loss Premium. 


Definition 8. Consider a non-negative bivariate random vector $X$ and calculate the
Value at Risk of its single components. Product Stop-loss Premium (PSP) is defined
as follows: $\mathrm{PSP}[X; p] = E[(X_1 - \mathrm{VaR}_{X_1})_+ (X_2 - \mathrm{VaR}_{X_2})_+]$.


Of course this definition could be extended also to the general case, writing:
$\mathrm{PSP}[X; p] = E[(X_1 - \mathrm{VaR}_{X_1})_+ \cdots (X_n - \mathrm{VaR}_{X_n})_+]$,


but some properties will be different, because not everything stated for the bivariate 
case works in the multivariate one. 

Our aim is to give a multivariate measure that can detect the joint tail risk of the 
distribution. In doing this we also have a representation of the marginal risks and 
thus the result is a measure that describes the joint and marginal risk in a simple and 
intuitive manner. 

We examine in particular the case $X_1 > \mathrm{VaR}_{X_1}$ and $X_2 > \mathrm{VaR}_{X_2}$ simultaneously,
since large and small values will tend to be more often associated under the
distribution that dominates the other one.

Random variables are concordant if they tend to be all large together or small
together and in this case we have a measure with non-trivial values when the
variables exceed given thresholds together and are not constant, otherwise we have
$\mathrm{PSP}[X; p] = 0$.

It is clear that concordance affects this measure, and in general we know that
concordance behaviour influences risk management of large portfolios of insurance
contracts or financial assets. In these portfolios the main risk is the occurrence of
many joint default events or simultaneous downside evolutions of prices.

PSP for multivariate distributions is interpreted as a measure that can keep the
dependence structure of the components of the random vector considered, when
specified thresholds are exceeded by each component with probability $p_i$; but indeed it
is also a measure that can evaluate the joint as well as the marginal risk. In fact, we
have:

$$\mathrm{PSP}[X; p] = E[(X_1 - \mathrm{VaR}_{X_1})_+ (X_2 - \mathrm{VaR}_{X_2})_+] = \int_{\mathrm{VaR}_{X_2}}^{+\infty} \int_{\mathrm{VaR}_{X_1}}^{+\infty} S^X(x)\, dx_1\, dx_2 - \mathrm{VaR}_{X_2} E[X_1 \mid X_1 > \mathrm{VaR}_{X_1}] - \mathrm{VaR}_{X_1} E[X_2 \mid X_2 > \mathrm{VaR}_{X_2}] + \mathrm{VaR}_{X_1} \mathrm{VaR}_{X_2}.$$

Let us denote with $\mathrm{CVaR}[X; p]$ the CVaR restricted to the bivariate independent
case, with $X_1 > \mathrm{VaR}_{X_1}$ and $X_2 > \mathrm{VaR}_{X_2}$, then we have:

$$\mathrm{CVaR}[X; p] = E[X_1 \mid X_1 > \mathrm{VaR}_{X_1}]\, E[X_2 \mid X_2 > \mathrm{VaR}_{X_2}] - \mathrm{VaR}_{X_1} E[X_2 \mid X_2 > \mathrm{VaR}_{X_2}] - \mathrm{VaR}_{X_2} E[X_1 \mid X_1 > \mathrm{VaR}_{X_1}] + \mathrm{VaR}_{X_1} \mathrm{VaR}_{X_2}.$$

We can conclude that these risk measures are the same for bivariate vectors with
independent components, on the condition of these restrictions.


We propose here a way to compare dependence, introducing a stochastic order 
based on our PSP measure. 


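Proposition 3. If $X, Y \in \mathcal{R}_2$, then $X \preceq_{SM} Y \iff \mathrm{PSP}[X; p] \le \mathrm{PSP}[Y; p]$ $\forall p$
holds.


Proof. If $X \preceq_{SM} Y$ then $E[f(X)] \le E[f(Y)]$ for every supermodular function $f$,
therefore also for the specific supermodular function that defines our PSP, and then
$\mathrm{PSP}[X; p] \le \mathrm{PSP}[Y; p]$ follows. Conversely, if
$\mathrm{PSP}[X; p] \le \mathrm{PSP}[Y; p]$ and $X, Y \in \mathcal{R}_2$, we have

$$\int_{\mathrm{VaR}_{X_2}}^{+\infty} \int_{\mathrm{VaR}_{X_1}}^{+\infty} S^X(t)\, dt \le \int_{\mathrm{VaR}_{Y_2}}^{+\infty} \int_{\mathrm{VaR}_{Y_1}}^{+\infty} S^Y(t)\, dt$$

with $\mathrm{VaR}_{X_1} = \mathrm{VaR}_{Y_1}$ and $\mathrm{VaR}_{X_2} = \mathrm{VaR}_{Y_2}$. It follows that $S^X(t) \le S^Y(t)$, which
leads to $X \preceq_{C} Y$. From Theorem 1 follows $X \preceq_{SM} Y$. □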


Obviously PSP is also consistent with the concordance order. 
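
The consistency of PSP with dependence orderings can be seen on a small data set. The following sketch is an empirical illustration only: the functions `empirical_var` and `psp` and the two toy samples are hypothetical, and the quantile convention (the smallest observation whose empirical distribution function reaches $p$) is an assumption of ours. The two samples share the same marginals, so they differ only in their dependence structure.

```python
import math
from typing import Sequence, Tuple


def empirical_var(values: Sequence[float], p: float) -> float:
    """Smallest observation x whose empirical distribution function is at least p."""
    ordered = sorted(values)
    k = max(math.ceil(p * len(ordered)) - 1, 0)
    return ordered[k]


def psp(sample: Sequence[Tuple[float, float]], p: Tuple[float, float]) -> float:
    """Empirical Product Stop-loss Premium of a bivariate sample."""
    v1 = empirical_var([a for a, _ in sample], p[0])
    v2 = empirical_var([b for _, b in sample], p[1])
    excess_products = (max(a - v1, 0.0) * max(b - v2, 0.0) for a, b in sample)
    return sum(excess_products) / len(sample)


# Same marginals {1, ..., 8}; comonotone versus countermonotone pairing.
comonotone = [(float(k), float(k)) for k in range(1, 9)]
countermonotone = [(float(k), float(9 - k)) for k in range(1, 9)]
levels = (0.75, 0.75)
```

With these levels both marginal VaRs equal 6, the comonotone sample has two joint exceedances, and the countermonotone sample has none, so its PSP is zero.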

Another discussed property for risk measures is subadditivity; risk measures that
are subadditive for all possible dependence structures of the vectors do not reflect the
dependence between $(X_1 - a_1)_+$ and $(X_2 - a_2)_+$.

We can note that our PSP is not always subadditive; in fact, if we take the
non-negative vectors $X, Y \in \mathcal{X}$, the following relation is not always satisfied:

$$E[(X_1 + Y_1 - \mathrm{VaR}_{X_1} - \mathrm{VaR}_{Y_1})_+ (X_2 + Y_2 - \mathrm{VaR}_{X_2} - \mathrm{VaR}_{Y_2})_+] \le E[(X_1 - \mathrm{VaR}_{X_1})_+ (X_2 - \mathrm{VaR}_{X_2})_+] + E[(Y_1 - \mathrm{VaR}_{Y_1})_+ (Y_2 - \mathrm{VaR}_{Y_2})_+].$$

After verifying all the possible combinations among scenarios

$X_1 > \mathrm{VaR}_{X_1}$, $X_1 < \mathrm{VaR}_{X_1}$, $X_2 > \mathrm{VaR}_{X_2}$, $X_2 < \mathrm{VaR}_{X_2}$,
$Y_1 > \mathrm{VaR}_{Y_1}$, $Y_1 < \mathrm{VaR}_{Y_1}$, $Y_2 > \mathrm{VaR}_{Y_2}$, $Y_2 < \mathrm{VaR}_{Y_2}$,

we can conclude that the measure is not subadditive when:

• the sum of the components is concordant and such that
$X_i + Y_i > \mathrm{VaR}_{X_i} + \mathrm{VaR}_{Y_i}$ $\forall i = 1, 2$,
with discordant components of at most one vector;
• the sum of the components is concordant and such that
$X_i + Y_i > \mathrm{VaR}_{X_i} + \mathrm{VaR}_{Y_i}$ $\forall i = 1, 2$,
with both vectors that have concordant components, but with a different sign: i.e.,
$X_i > (<)\, \mathrm{VaR}_{X_i}$ and $Y_i < (>)\, \mathrm{VaR}_{Y_i}$ $\forall i$;
• $X_i > \mathrm{VaR}_{X_i}$ and $Y_i > \mathrm{VaR}_{Y_i}$ $\forall i$ simultaneously.


Hence, in these cases, the measure reflects the dependence structure of the vectors 
involved. 


6 Conclusions 


In this paper we have proposed a mathematical framework for the introduction of 
multivariate measures of risk. After considering the main properties a vector measure 
should have, and recalling some stochastic orders, we have outlined our results on 
multivariate risk measures. First of all, we have generalised the theory about dis- 
torted risk measures for the multivariate case, giving a representation result for those 
measures that are subadditive and defining the vector VaR and CVaR. Then, we have 
introduced a new risk measure, called Product Stop-Loss Premium, through its defini- 
tion, its main properties and its relationships with CVaR and measures of concordance. 
This measure also lets us propose a new stochastic order. In the literature there are
other attempts to study multivariate risk measures, for example [1-3, 7] and [9], but
they all approach the subject from different points of view. Indeed, [9] is the first
work that deals with multivariate distorted risk measures, but it represents only an
outline for further developments, such as those carried out in the present work.

More recently, the study of risk measures has focused on weakening the definition
of convenient properties for risk measures, in order to represent the markets in a more
faithful manner, or on the generalisation of the space that collects the random vectors.


Assessing risk perception by means of ordinal models


Paola Cerchiello, Maria Iannario, and Domenico Piccolo 


Abstract. This paper presents a discrete mixture model as a suitable approach for the anal- 
ysis of data concerning risk perception, when they are expressed by means of ordered scores 
(ratings). The model, which is the result of a personal feeling (risk perception) towards the 
object and an inherent uncertainty in the choice of the ordinal value of responses, reduces the 
collective information, synthesising different risk dimensions related to a preselected domain. 
After a brief introduction to risk management, the presentation of the CUB model and related 
inferential issues, we illustrate a case study concerning risk perception for the workers of a 
printing press factory. 


Key words: risk perception, CUB models, ordinal data 


1 Introduction 


During the past quarter-century, researchers have been intensively studying risk from 
many perspectives. The field of risk analysis has rapidly grown, focusing on issues of 
risk assessment and risk management. The former involves the identification, quan- 
tification and characterisation of threats faced in fields ranging from human health 
to the environment through a variety of daily-life activities (e.g., banking, insurance,
IT-intensive society). Meanwhile, risk management focuses on processes of
communication, mitigation and decision making. In normal usage, the notion of “risk”
has negative connotations and involves involuntary and random aspects. Moreover, 
the conceptual analysis of the risk concept wavers from a purely statistical definition 
(objective) to a notion based on the mind’s representation (subjective). In this con- 
text, perception of risk plays a prominent role in people’s decision processes, in the 
sense that different behaviours depend on distinct risk perception evaluation. Both 
individual and group differences have been shown to be associated with differences 
in perceptions of the relative risk of choice options, rather than with differences in 
attitude towards (perceived) risk, i.e., a tendency to approach or to avoid options 
perceived as riskier [23, 24]. Risk is subjectively defined by individuals and is
influenced by a wide array of psychological, social, institutional and cultural factors
[22]. However, there is no consensus on the relationship between personality and risk
perception [5].

Another fundamental dimension related to the concept of risk deals with the 
dichotomy between experts’ perceptions and those of the common people. The role of 
experts is central in several fields, especially when quantitative data are not sufficient 
for the risk assessment phase (e.g., in operational risk). Typically, experts’ opinions
are collected via questionnaires on ordinal scales; thereby several models have been 
proposed to elaborate and exploit results: linear aggregation [9], fuzzy methods [2, 
25] and Bayesian approaches [4]. 

Our contribution follows this research path, proposing a class of statistical models
able to measure the perceptions expressed either by experts or common people. In
particular we focus on the problem of risk perception related to the workplace with
regard to injury. Thus, some studies focusing on the relationship between organisational
factors and risk behaviour in the workplace [21] suggest that the likelihood of
injuries is affected especially by the following variables: working conditions, occupa- 
tional safety training programmes and safety compliance. Rundmo [20] pointed out 
how the possibility of workplace injuries is linked to the perception of risk frequency 
and exposure. 


2 CUB models: description and inference 


A researcher faced with a large amount of raw data wants to synthesise it in a way 
that preserves essential information without too much distortion. The primary goal of 
statistical modelling is to summarise massive amounts of data within simple structures 
and with few parameters. Thus, it is important to keep in mind the trade-off between 
accuracy and parsimony. In this context we present an innovative data-reduction 
technique by means of statistical models (CUB) able to map different results into 
a parametric space and to model distinct and weighted choices/perceptions of each 
decision-maker. 

CUB models, in fact, are designed to generate probability structures adequate to
interpret, fit and forecast the subject’s perceived level of a given “stimulus” (risk, sen- 
sation, opinion, perception, awareness, appreciation, feeling, taste, etc.). All current 
theories of choice under risk or uncertainty assume that people assess the desirability 
and likelihood of possible outcomes of choice alternatives and integrate this informa- 
tion through some type of expectation-based calculus to reach a decision. Instead, the 
approach of CUB models is motivated by a direct investigation of the psychological 
process that generates the human choice [15]. 

Generally, the choices derived from the perception of risk are of a qualitative
(categorical) nature, and classical statistical models introduced for continuous
phenomena are neither suitable nor effective. Thus, qualitative and ordinal data require
specific methods to avoid incongruities and/or loss of efficiency in the analysis of real
data. With this structure we investigate a probability model that produces interpretable
results and a good fit. It decodes a discrete random variable (MUB, introduced by
D’Elia and Piccolo [8]) and we use CUB models when we relate the responses to
subjects’ covariates. The presence of Uniform and shifted Binomial distributions and the
introduction of Covariates justify the acronym CUB. This model combines a personal
feeling (risk awareness) towards the object and an inherent uncertainty in the choice
of the ordinal value of responses when people are faced with discrete choices.

The result for interpreting the responses of the raters is a mixture model for ordered
data in which we assume that the rank $r$ is the realisation of a random variable $R$ that
is a mixture of Uniform and shifted Binomial random variables (both defined on the
support $r = 1, 2, \ldots, m$), with a probability distribution:

$$\Pr(R = r) = \pi \binom{m-1}{r-1} (1 - \xi)^{r-1}\, \xi^{m-r} + (1 - \pi)\, \frac{1}{m}, \qquad r = 1, 2, \ldots, m. \qquad (1)$$

The parameters satisfy $\pi \in (0, 1]$ and $\xi \in [0, 1]$, and the model is well defined for a
given $m > 3$.
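
The mixture distribution is simple to compute: a weight $\pi$ on a shifted Binomial term with success parameter $1 - \xi$, and a weight $1 - \pi$ on a discrete Uniform term over $m$ categories. The sketch below is a minimal illustration; the function name `cub_pmf` and the argument names are hypothetical, and the symbols $\pi$ and $\xi$ are written `pi` and `xi`.

```python
from math import comb
from typing import List


def cub_pmf(m: int, pi: float, xi: float) -> List[float]:
    """Probabilities Pr(R = r), r = 1, ..., m, of the Uniform / shifted Binomial mixture."""
    if m <= 3:
        raise ValueError("the model is taken to be well defined only for m > 3")
    if not (0.0 < pi <= 1.0) or not (0.0 <= xi <= 1.0):
        raise ValueError("expected pi in (0, 1] and xi in [0, 1]")
    uniform_part = (1.0 - pi) / m
    return [
        pi * comb(m - 1, r - 1) * (1.0 - xi) ** (r - 1) * xi ** (m - r) + uniform_part
        for r in range(1, m + 1)
    ]
```

Setting `pi = 1` recovers the pure shifted Binomial, while values of `pi` close to zero push every rating probability towards `1 / m`.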

The risk-as-feelings hypothesis postulates that responses to risky situations
(including decision making) result in part from direct (i.e., not cortically mediated)
emotional influences, including feelings such as worry, fear, dread or anxiety. Thus, the
first component, feeling-risk awareness, is generated by a continuous random vari- 
able whose discretisation is expressed by a shifted Binomial distribution. This choice 
is motivated by the ability of this discrete distribution to cope with several differ- 
ent shapes (skewness, flatness, symmetry, intermediate modes, etc.). Moreover, since 
risk is a continuous latent variable summarised well by a Gaussian distribution, the 
shifted Binomial is a convenient unimodal discrete random variable on the support 
{1,2,..., m}. 

At the same time, feeling states are postulated to respond to factors, such as the 
immediacy of a risk, that do not enter into cognitive evaluations of the risk and also 
respond to probabilities and outcome values in a fashion that is different from the way 
in which these variables enter into cognitive evaluations. Thus, the second compo- 
nent, uncertainty, depends on the specific components/values (knowledge, ignorance, 
personal interest, engagement, time spent to decide) concerning people. As a
consequence, it seems sensible to express it by a discrete Uniform random variable. Of
course, the mixture (1) allows the perception of any person to be weighted with
respect to this extreme distribution. Indeed, only if $\pi = 0$ does a person act as motivated
by a total uncertainty; instead, in real situations, the quantity $(1 - \pi)$ measures the
propensity of each respondent towards the maximal uncertainty.

An important characterisation of this approach is that we can map a set of
expressed ratings into an estimated model via the $(\pi, \xi)$ parameters. Thus, an observed
complex situation of preferences/choices may be simply related to a single point in
the parametric space.

In this context, it is reasonable to assume that the main components of the choice 
mechanism change with the subjects’ characteristics (covariates). Thus, CUB models 
are able to include explanatory variables that are characteristics of subjects and which 
influence the position of different response choices. It is interesting to analyse the 
values of the corresponding parameters conditioned to covariate values. 
In fact, better solutions are obtained when we introduce covariates for relating
both feeling and uncertainty to the subject’s characteristics. Generally, covariates
improve the model fit, discriminate among different sub-populations and allow more
accurate predictions. Moreover, this circumstance should enhance the interpretation of
the parameter estimates and the discussion of possible scenarios.

Following a general paradigm [14, 18], we relate the $\pi$ and $\xi$ parameters to the
subjects’ covariates through a logistic function. The chosen mapping is the simplest
one among the many transformations of real variables into the unit interval and a
posteriori it provides evidence of ease of interpretation for the problems we will be
discussing.

When we introduce covariates into a MUB random variable, we define these
structures as CUB(p, q) models characterised by a general parameter vector
$\theta = (\pi, \xi)$ via the logistic mappings:

$$(\pi \mid \mathbf{y}_i) = \frac{1}{1 + e^{-\mathbf{y}_i \boldsymbol{\beta}}}; \qquad (\xi \mid \mathbf{w}_i) = \frac{1}{1 + e^{-\mathbf{w}_i \boldsymbol{\gamma}}}; \qquad i = 1, 2, \ldots, n. \tag{2}$$

Here, we denote by $\mathbf{y}_i$ and $\mathbf{w}_i$ the subject’s covariates for
explaining $\pi_i$ and $\xi_i$, respectively. Notice that (2) allows the consideration
of models without covariates (p = q = 0); moreover, the significant sets of covariates
may or may not overlap [11, 13, 19].

Finally, inferential issues for CUB models are tackled by maximum likelihood 
(ML) methods, exploiting the E-M algorithm [16, 17]. The related asymptotic infer- 
ence may be applied using the approximate variance and covariance matrix of the ML 
estimators [14]. This approach has been successfully applied in several fields, espe- 
cially in relation to evaluations of goods and services [6] and other fields of analysis 
such as social analysis [10, 11], medicine [7], sensometric studies [19] and linguistics 
[1]. 

The models we have introduced are able to fit and explain the behaviour of a
univariate rating variable, although we realise that the expression of a complete
ranking list of m objects/items/services by n subjects would require a multivariate
setting. Thus, the analysis pursued in this paper should be interpreted as a marginal
one, as if we studied the rank distribution of a single item without reference to the
ranks expressed towards the remaining ones.

Then in the following section, we analyse both the different items and injuries; 
afterwards we propose a complex map that summarises the essential information 
without distortion or inaccuracy. 


3 Assessing risk perception: some empirical evidence 


3.1 Data analysis 


A cross-sectional study was performed in a printing press factory in Northern Italy 
that manufactures catalogues, books and reproductions of artworks. The staff of the 
factory consists of 700 employees (300 office workers and 400 blue-collar workers). 
The study focused on the blue-collar population of six different departments, each
dealing with a specific industrial process. The subjects in the cohort are distributed
among the following six units, whose main activities are also described. In the Plates
department, workers set up the plates and cylinders used in the printing operations,
which are then carried out in the Rotogravure and Offset departments. The Packaging
department is responsible for the bookbinding and packaging operations, while the 
Plants department operates several systems (e.g., electrical and hydraulic) and pro- 
vides services (e.g., storage and waste disposal) that support the production side of the 
company. Lastly, the Maintenance department workers perform a series of operations 
connected with the monitoring and correct functioning of the different equipment of 
the plant. 

With the purpose of studying injury risk perception among company workers, 
a structured “Workplace Risk Perception Questionnaire” was developed. The ques- 
tionnaire asked the respondents to express their opinions on a series of risk factors 
present in their workplace. A 7-point Likert scale was used to elicit the workers’
answers, whose range is interpreted as: 1 = “low perceived risk”; 7 = “high perceived
risk”. Moreover, we pay particular attention to socio-demographic characteristics
such as ‘gender’ (dichotomous variable: ‘0’ = men, ‘1’ = women), ‘number of working
years’ within the company (continuous variable ranging from 1 to 30) and ‘type of
injury’ (dichotomous variable: ‘0’ = not severe injury, ‘1’ = severe injury). Finally,
n = 348 validated questionnaires were collected.


3.2 Control and measure risk perception: a map 


As already discussed, we built a class of models to evaluate, control and measure the
risk perception and, by means of monitoring activity, to inform the stakeholders of the
direction of new policies. In this case we show a synthesis map which contains the
whole information related to the different risk dimensions.

In Figure 1 we plot for each item the reactions of feeling and uncertainty expressed
by people. We can observe that the uncertainty is concentrated between 0 and 0.6, a
range indicating a high level of indecision. The characteristic of feeling, however,
extends over the whole parametric space. Both aspects illustrate how the responses
interact to determine behaviour. Moreover, we examine in more depth some specific
aspects of risk-related phenomena that are regarded as more interesting.

In the case of control, for example, we can observe a dichotomous behaviour: 
less sensitivity for injuries such as eye-wound, hit, moving machinery clash (in these 
cases people do not seem to ask for more control), and more for other injuries where 
people, on their scale of risk, consider the aspect of control as a sensible variable for 
improving the conditions of their job. 

Training, instead, receives an interesting evaluation, as it is considered an
important variable of the survey. Less evidence appears for other items, which are
shared among different levels and whose estimates are spread over the parametric space.

Fig. 1. Assessing risk perception: a map of items. 1=Structural Collapse (SC), 2=Short
Circuit (SH), 3=Moving Machinery Clash (MC), 4=Eye Wound (EW), 5=Collision (CO),
6=Fire/Explosion (FE), 7=Slipping (SL), 8=Strain (ST), 9=Cut (CU), 10=Hit (HI)


3.3 Perception of fire/explosion risk 


In this context we analyse the degree of danger, a principal item in measuring risk 
perception and we focus on the responses of samples with respect to fire/explosion 
risk. More specifically, we consider the degree of danger that people perceive with 
respect to fire risk and we connect it with some covariates. 

In this kind of analysis, sensible covariates have to be introduced in the model 
by means of a stepwise strategy where a significant increase in the log-likelihoods 
(difference of deviances) is the criterion to compare different models. In order to 
simplify the discussion, we present only the full model and check it with respect to a 
model without covariates. 
In Table 1 we list the parameter estimates of a CUB(0,3) model with gender (Gen),
working years (Year) and serious injury (Serinj) as sensible covariates.


Table 1. Parameters


Covariates        Parameters    Estimates (Standard errors)
Uncertainty       $\pi$          0.440 (0.060)
Constant          $\gamma_0$     1.515 (0.319)
Gender            $\gamma_1$     1.029 (0.573)
Working years     $\gamma_2$    -0.032 (0.015)
Serious injury    $\gamma_3$    -0.928 (0.323)


Log-likelihood functions of the estimated CUB(0,0) and CUB(0,3) models are
$\ell_{00} = -660.07$ and $\ell_{03} = -651.81$, respectively. As a consequence, the
model with covariates adds remarkable information to the generating mechanism of the
data since $2(\ell_{03} - \ell_{00}) = 16.52$ is highly significant when compared to
the $\chi^2_{0.05} = 7.815$ percentile with g = 3 degrees of freedom.
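
The comparison above can be checked with a few lines of code. This is a minimal sketch that uses only the two log-likelihood values reported in the text; the helper `chi2_sf_3df` is a hypothetical name and relies on the closed-form survival function of a chi-square variable with 3 degrees of freedom, so no statistical library is assumed.

```python
from math import erfc, exp, pi, sqrt

def chi2_sf_3df(x):
    """Survival function P(X > x) of a chi-square variable with 3 degrees of freedom."""
    return erfc(sqrt(x / 2.0)) + sqrt(2.0 * x / pi) * exp(-x / 2.0)

loglik_cub00 = -660.07  # model without covariates, as reported in the text
loglik_cub03 = -651.81  # model with three covariates for the feeling parameter

# Likelihood ratio statistic (difference of deviances).
lr_statistic = 2.0 * (loglik_cub03 - loglik_cub00)
p_value = chi2_sf_3df(lr_statistic)

print(f"LR statistic = {lr_statistic:.2f}, approximate p-value = {p_value:.5f}")
```

The same helper evaluated at 7.815 returns a tail probability of about 0.05, which agrees with the critical value quoted above.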

We may express the feeling parameters as a function of covariates in the following 
way: 


$$\xi_i = \frac{1}{1 + e^{-1.515 - 1.029\,\mathrm{Gen}_i + 0.032\,\mathrm{Year}_i + 0.928\,\mathrm{Serinj}_i}},$$


which synthesises the perception of danger of fire/explosion risk with respect to the
chosen covariates. More specifically, this perception increases for men and for those
who have worked for many years (a proxy of experience) and decreases for the part
of the sample that had not suffered from a serious accident. For correct interpretation,
it must be remembered that if items are scored (as a vote, increasing from 1 to m as
liking increases) then $(1 - \xi)$ must be considered as the actual measure of preference
[10]. Although the value of the response is not metric (as it stems from a qualitative
judgement), it may be useful for comparative purposes to compute the expected value
of R, since it is related to the continuous proxy that generates the risk perception.
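
To make the profile comparison concrete, the sketch below evaluates the fitted feeling parameter and the expected score for given covariate profiles, using the estimates of Table 1. It assumes the usual parametrisation of the mixture, under which the expected value is (m + 1)/2 + pi (m - 1)(1/2 - xi); this formula is not stated in the text and is an assumption here, as are the function names.

```python
from math import exp

PI_HAT = 0.440                               # estimate of pi from Table 1
GAMMA_HAT = (1.515, 1.029, -0.032, -0.928)   # constant, gender, working years, serious injury
M = 7                                        # number of points of the Likert scale

def feeling_xi(gender, years, serious_injury):
    """Logistic link for xi with the Table 1 estimates (gender: 0 = men, 1 = women)."""
    linear = (GAMMA_HAT[0] + GAMMA_HAT[1] * gender
              + GAMMA_HAT[2] * years + GAMMA_HAT[3] * serious_injury)
    return 1.0 / (1.0 + exp(-linear))

def expected_score(gender, years, serious_injury):
    """Expected rating under the assumed mixture of shifted Binomial and Uniform."""
    xi = feeling_xi(gender, years, serious_injury)
    return (M + 1) / 2.0 + PI_HAT * (M - 1) * (0.5 - xi)

# Expected score for men without a serious injury, over selected working years.
for years in (1, 10, 20, 30):
    print(years, round(expected_score(0, years, 0), 3))
```

Under these assumptions the expected score rises with working years and is higher for men than for women, in line with the interpretation given above.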

More specifically, Figure 2 shows the expectation and its relation to the varying
working years, for all the profiles of gender and serious injury.

Fig. 2. Expected score as a function of working years, given gender and serious injury


4 Conclusions


In this paper, we obtained some results about direct inference on the feeling/risk
awareness and uncertainty parameters by means of CUB models with and without
covariates. The experiments confirmed that this new statistical approach gives a
different perspective on the evaluation of psychological processes and mechanisms that
generate/influence risk perception in people. The results show that CUB models are a
suitable and flexible tool for examining and quantifying the change of response over
one or more categorical and/or continuous covariates, and they provide a deeper
insight into these kinds of dataset. They also allowed the summary of much information
and some interesting evaluation of specific points investigated when covariates are
both absent and present.

Moreover, we stress that the proposed model is a manifold target approach: in fact, 
it can be profitably applied to a variety of fields, ranging from credit and operational 
risk [3] to reputational and churn risk. Finally, it represents a convincing tool to exploit 
opinions expressed by field experts. 


A financial analysis of surplus dynamics for
deferred life schemes* 


Rosa Cocozza, Emilia Di Lorenzo, Albina Orlando, and Marilena Sibillo 


Abstract. The paper investigates the financial dynamics of the surplus evolution in the case 
of deferred life schemes, in order to evaluate both the distributable earnings and the expected 
worst occurrence for the portfolio surplus. The evaluation is based on a compact formulation
of the insurance surplus defined as the difference between accrued assets and present value of 
relevant liabilities. The dynamic analysis is performed by means of Monte Carlo simulations 
in order to provide a year-by-year valuation. The analysis is applied to a deferred life scheme 
exemplar, considering that the selected contract constitutes the basis for many life insurance 
policies and pension plans. The evaluation is put into an asset and liability management deci- 
sion-making context, where the relationships between profits and risks are compared in order 
to evaluate the main features of the whole portfolio. 


Key words: financial risk, solvency, life insurance 


1 Introduction 


The paper investigates the financial dynamics of surplus analysis with the final aim 
of performing a breakdown of the distributable earnings. The question, put into an 
asset and liability management context, is aimed at evaluating and constructing a sort 
of budget of the distributable earnings, given the current information. To this aim, a 
general reconstruction of the whole surplus is performed by means of an analytical 
breakdown already fully developed elsewhere [1], and whose main characteristic is 
the computation of a result of the portfolio, that actuaries would qualify as surplus, 
accountants as income and economists as profit. 

The analysis is developed with the aim of evaluating what share of each year’s
earnings can be distributed without compromising future results. This share is only a
sort of minimum level of distributed earnings which can serve as a basis for business
decisions and that can be easily updated year-by-year as market conditions change.
Then the formal model is applied to a life annuity cohort in a stochastic context in
order to exemplify the potential of the model. In this paper deferred schemes are
selected considering that they can be regarded as the basis for many life insurance
policies and pension plans. Nevertheless, the model can be applied, given the necessary
adjustments, to any kind of contract as well as to non-homogeneous portfolios.

* Although the paper is the result of a joint study, Sections 1, 2 and 4 are by R. Cocozza,
whilst Section 3 is by E. Di Lorenzo, A. Orlando and M. Sibillo.

The rest of the paper is organised as follows. Section 2 introduces the logical 
background of the model itself, while Section 3 details the mathematical framework
and the computational application. Section 4 comments on the numerical results 
obtained and Section 5 concludes. 


2 The model 


As stated in [1], the surplus of the policy is identified by the difference between the
present value of the future net outcomes of the insurer and the (capitalised) flows paid 
by the insureds. This breakdown is evaluated year by year with the intent to compile 
a full prospective account of the surplus dynamics. In the case of plain portfolio 
analysis, the initial surplus is given by the loadings applied to pure premiums; in the 
case of a business line analysis, the initial surplus, set as stated, is boosted by the 
initial capital allocated to the business line or the product portfolio. 

The initial surplus value, in both cases, can be regarded as the proper initial capital
whose dynamic has to be explored with the aim of setting a general scheme of
distributable and undistributable earnings. More specifically, given that at the
inception of the business the initial surplus is set as $S_0$, the prospective future
outcomes at time $t$, defined as $S_t$, can be evaluated by means of simulated results
to assess worst cases given a certain level of probability or a confidence interval.

The build up of these results, by means of the selected model and of Monte Carlo 
simulations (see Section 3), provides us with a complete set of future outcomes at 
the end of each period t. These values do not depend on the amount of the previous 
distributed earnings. Those results with an occurrence probability lower than the 
threshold value (linked to the selected confidence interval) play the role of worst-case
scenarios and their average can be regarded as the expected worst occurrence
corresponding to a certain level of confidence when it is treated as a Conditional 
Value-at-Risk (CVaR). Ultimately, for each period of time, we end up with a complete 
depiction of the surplus by means of a full set of outcomes, defined by both expected 
values and corresponding CVaR. 

The results we obtain for each period can therefore be used as a basis for the
evaluation of the distributable earnings, with the final aim of assessing the
distributable surplus share. If the CVaR stands for the expected worst occurrence given
a level of confidence, its interpretation is pragmatically straightforward: it is the
expected worst value of the surplus for the selected confidence level. So for any
period $t$, the CVaR estimates the threshold surplus at the confidence level selected
and automatically sets the maximum distributable earnings of the preceding period. In
other words, the CVaR of $S_t$ can be regarded as the maximum distributable amount of
$S_{t-1}$; at the same time:

• the ratio of the CVaR of $S_t$ to $S_{t-1}$ can be regarded as the distributable share
  (DS) of the $t - 1$ result; and

• the ratio of the CVaR of $S_t$ minus $S_{t-1}$ to the $t - 1$ result can be regarded as
  the worst expected Return on Surplus (RoS) for the selected level of confidence.


Analogous conclusions can be inferred when the analysis is referred to a business line 
and the surplus is enhanced by the allocated capital: the interpretation of the result is 
similar and even clearer, as the last ratio is a proper worst expected return on equity. 
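
The two ratios defined above can be written as a short computation. This is a minimal sketch in which the denominator is taken to be the expected surplus of the preceding period, an interpretation that reproduces the figures of Table 1 in Section 4; the function names are hypothetical.

```python
def distributable_share(cvar_t, surplus_prev):
    """DS: ratio of the CVaR of S_t to the surplus at time t - 1."""
    return cvar_t / surplus_prev

def worst_return_on_surplus(cvar_t, surplus_prev):
    """Worst expected RoS: (CVaR of S_t minus S_{t-1}) divided by S_{t-1}."""
    return (cvar_t - surplus_prev) / surplus_prev

# Values taken from Table 1 of Section 4: E(S_0) and the CVaR at time 1.
surplus_0 = 1649.899
cvar_1 = 179.9519

ds_1 = distributable_share(cvar_1, surplus_0)
ros_1 = worst_return_on_surplus(cvar_1, surplus_0)
print(f"DS = {ds_1:.2%}, RoS = {ros_1:.2%}")
```

By construction the two quantities differ by exactly one, so each row of the table satisfies DS - RoS = 100%.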

From a methodological point of view, we would like to stress that the analysis 
and, therefore, the simulation procedure could be performed with reference to all 
the risk factors relevant to the time evolution of the portfolio. Many dynamics can 
simultaneously contribute to the differentials that depend on risk factors linked to 
both the assets in which premiums are invested and the value of liabilities for which 
capitalised premiums are deferred. Together with the demographic dynamic, the most 
important factor is the nature of the assets: if these are financial, the risks faced will 
be mainly financial, they will depend directly on the asset type and will not have any 
autonomous relevance. Besides, the crux of the problem is the difference between 
the total rate of return on assets and the rate of interest originally applied in premium 
calculation, so that it can be precisely addressed as investment risk, in order to highlight 
the composite nature of relevant risk drivers. At the same time, other factors can 
contribute to the difference such as the quality of the risk management process, with 
reference to both diversification and risk pooling. This implies that the level of the
result and its variability are strictly dependent on individual company elements that
involve both exogenous and endogenous factors.

Since our focus is on the financial aspect of the analysis, in the remainder of the
paper we concentrate only on the question of the investment rate, excluding any
demographic component and risk evaluation from our analysis. Bearing in mind this
perspective, the rate actually used as a basis for the simulation procedure has to be 
consistent with the underlying investment and the parameters used to describe the rate 
process have to be consistent with the features of the backing asset portfolio. There- 
fore, once we decide the strategy, the evaluation is calibrated to the expected value 
and estimated variance of the proper return on assets as set by the investment
portfolio. In other words, if we adopt, for instance, a bond strategy, the relevant
parameters will be estimated from the bond market, while if we adopt an equity investment, the
relevant values will derive from the equity market, and so on, once we have defined 
the composition of the asset portfolio. 


3 Surplus analysis 


3.1 The mathematical framework 


In the following we take into account a stochastic scenario involving the financial and 
the demographic risk components affecting a portfolio of identical policies issued to 
a cohort of c insureds aged x at issue. 

We denote by $X_s$ the stochastic cash flow referred to each contract at time $s$ and by
$N_s$ the number of claims at time $s$, $\{N_s\}$ being i.i.d. and multinomial
$(c, E[\mathbf{1}_s])$, where the random variable $\mathbf{1}_s$ takes the value 1 if
the insured event occurs, 0 otherwise.

The value of the business at time $t$ is expressed by the portfolio surplus $S_t$ at
that time, that is, the stochastic difference between the value of the assets and that
of the liabilities assessed at time $t$. In general we can write:

$$S_t = \sum_s N_s X_s \, e^{\int_s^t \delta_u \, du}, \tag{1}$$

where $\delta_u$ is the stochastic force of interest and $X_s$ is the difference
between premiums and benefits at time $s$.

Assuming that the random variables $N_s$ are independent of the random interest
$\delta_u$ and denoting by $\mathcal{F}_t$ the information flow at time $t$,

$$E[S_t \mid \mathcal{F}_t] = \sum_s c \, X_s \, E[\mathbf{1}_s] \, E\!\left[ e^{\int_s^t \delta_u \, du} \,\middle|\, \mathcal{F}_t \right]. \tag{2}$$

Formula (2) can be easily specialised in the case of a portfolio of m-deferred life
annuities, with annual level premiums P payable at the beginning of each year for a
period of n years (n < m) and constant annual instalments, R, paid at the end of each
year, payable if the insured is surviving at the corresponding payment date. It holds:

$$E[S_t \mid \mathcal{F}_t] = \sum_s c \, X_s \, {}_s p_x \, E\!\left[ e^{\int_s^t \delta_u \, du} \,\middle|\, \mathcal{F}_t \right], \tag{3}$$

where ${}_s p_x$ denotes the probability that the individual aged $x$ survives to age
$x + s$ and

$$X_s = \begin{cases} P & \text{if } s < n \\ -R & \text{if } s > m \\ 0 & \text{if } n < s < m. \end{cases} \tag{4}$$

As widely explained in the previous section, the surplus analysis provides useful tools 
for the equilibrium appraisal, which can be synthesised by the following rough but 
meaningful and simple relationship: 


$$\mathrm{Prob}(S_t > 0) = \varepsilon. \tag{5}$$

For a deeper understanding of the choice of $\varepsilon$, refer to [1]. From a more
general perspective, we can estimate the maximum loss $S_\alpha$ of the surplus at a
certain valuation time $t$ with a fixed confidence level $\alpha$, defined as

$$\mathrm{Prob}(S_t > S_\alpha) = \alpha, \tag{6}$$

that is:

$$S_\alpha = F^{-1}(1 - \alpha), \tag{7}$$

$F$ being the cumulative distribution function of $S_t$.

In the following we will take advantage of a simulation procedure to calculate the
quantile surplus involved in (6), basing our analysis on the portfolio mean surplus at
time $t$.

We focus on capturing the impact of the financial uncertainty on the financial position
at time $t$ (numerically represented by the surplus on that date). This uncertainty
constitutes a systematic risk source and is thus independent of the portfolio size. In
fact, in this case the pooling effect does not have any consequences, in contrast to the
effect of specific risk sources, such as the accidental deviations of mortality.

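Formally the valuation of the mean surplus can be obtained observing that it
is possible to construct a proxy of the cumulative distribution function of $S_t$ since
(cf. [4])

$$\lim_{c \to \infty} \mathrm{Prob}\left( \left| \frac{N_s}{c} - E[\mathbf{1}_s] \right| \ge \varepsilon \right) = 0,$$

hence, when the number of policies tends to infinity, $S_t / c$ converges in distribution
to the random variable

$$\Sigma_t = \sum_s X_s \, E[\mathbf{1}_s] \, e^{\int_s^t \delta_u \, du}.$$

In the case of the portfolio of m-deferred life annuities described above, we set:

$$x_s = \begin{cases} P & \text{if } s < n \\ -R & \text{if } s > m \end{cases} \qquad y_s = \begin{cases} R & \text{if } s > m \\ -P & \text{if } s < n \end{cases}$$

so, making the formalisation of the surplus explicit, we can write

$$S_t = \sum_{i=1}^{c} \left[ \sum_{s=0}^{\min(K_{x,i},\, t)} x_s \, e^{\int_s^t \delta_u \, du} - \sum_{s=t+1}^{\min(K_{x,i},\, T)} y_s \, e^{-\int_t^s \delta_u \, du} \right], \tag{8}$$

where $K_{x,i}$ denotes the curtate future lifetime of the $i$th insured aged $x$ at
issue and $T$ is the contract maturity ($T < \omega - x$, $\omega$ being the ultimate
age). We can point out that the second term on the right-hand side represents the
mathematical provision at time $t$.

So, remembering the homogeneity assumptions about the portfolio components,
formula (8) can be specialised as follows:

$$\Sigma_t = E\left[ \sum_{s=0}^{\min(K_x,\, t)} x_s \, e^{\int_s^t \delta_u \, du} - \sum_{s=t+1}^{\min(K_x,\, T)} y_s \, e^{-\int_t^s \delta_u \, du} \,\middle|\, \{\delta_u\}_{u \ge 0} \right] = \sum_{s \le t} x_s \, {}_s p_x \, e^{\int_s^t \delta_u \, du} - \sum_{s > t} y_s \, {}_s p_x \, e^{-\int_t^s \delta_u \, du}. \tag{9}$$
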

3.2 The computational application 


As the computational application of the preceding model we consider a portfolio of 
unitary 20-year life annuities with a deferment period of 10 years issued to a group of 
1000 male policyholders aged 40. The portfolio is homogeneous, since it is assumed 
that policyholders have the same risk characteristics and that the future lifetimes 
are independent and identically distributed. As far as the premiums are concerned,
we build up the cash flow mapping considering that premiums are paid periodically 
at the beginning of each year of the deferment period. The market premium has a 
global loading percentage of 7% compensating for expenses, safety and profit. Pure 
premiums are computed by applying 2% as the policy rate and by using as lifetables 
the Italian IPS55. 

Since our analysis is focused on the financial aspect, the single local source of
uncertainty is the spot rate, which is a diffusion process described by a Vasicek model

$$dr(t) = k(\mu - r(t))\,dt + \sigma\,dW(t), \qquad r(0) = r_0, \tag{10}$$

where $k$, $\mu$, $\sigma$ and $r_0$ are positive constants and $\mu$ is the long-term
rate. As informative filtration, we use the information set available at time 0. As a
consequence, for instance, in calculating the flows accrued up to time $t$, the
starting value $r_0$ for the simulated trajectories is the value known at time 0.
Analogously, in discounting the flows of the period subsequent to $t$, the starting
value of the simulated trajectories is $E[r_t \mid \mathcal{F}_0]$. The parameter
estimation is based on Euribor-Eonia data with calibration set on 11/04/2007 (cf. [2]),
since we make the hypothesis that the investment strategy is based on a roll-over
investment in short-term bonds, as we face an upward term structure. The estimated
values are $\mu = 4.10\%$, $\sigma = 0.5\%$ and $r_0 = 3.78\%$.
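
A trajectory of the rate process (10) can be generated as follows. This is a minimal sketch and not the authors' simulation code: it uses the exact Gaussian transition of the Vasicek model on a monthly grid, and the speed of mean reversion k = 0.25 is a purely hypothetical value, since the text reports only the long-term rate, the volatility and the initial rate. The function names are also hypothetical.

```python
import math
import random

def vasicek_mean(r0, k, mu, t):
    """Conditional mean E[r_t | F_0] of the Vasicek process."""
    return mu + (r0 - mu) * math.exp(-k * t)

def simulate_vasicek(r0, k, mu, sigma, horizon_years, steps_per_year, rng):
    """Simulate one path using the exact transition density of the process."""
    dt = 1.0 / steps_per_year
    decay = math.exp(-k * dt)
    # Standard deviation of r(t + dt) given r(t).
    step_std = sigma * math.sqrt((1.0 - decay * decay) / (2.0 * k))
    path = [r0]
    for _ in range(horizon_years * steps_per_year):
        shock = rng.gauss(0.0, 1.0)
        path.append(mu + (path[-1] - mu) * decay + step_std * shock)
    return path

rng = random.Random(2007)  # fixed seed for reproducibility
path = simulate_vasicek(r0=0.0378, k=0.25, mu=0.0410, sigma=0.005,
                        horizon_years=20, steps_per_year=12, rng=rng)
print(len(path), round(vasicek_mean(0.0378, 0.25, 0.0410, 10.0), 5))
```

The helper `vasicek_mean` corresponds to the starting value used when discounting flows after time t; repeating `simulate_vasicek` many times gives the sample of trajectories on which the surplus is evaluated.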

In order to evaluate the Expected Surplus and the CVaR in a simulation framework,
we consider the Vasicek model to describe the evolution in time of the global rate of
return on investments earned by the asset portfolio. The $\alpha$-quantile, $q_\alpha$,
of the surplus distribution is defined as:

$$\mathrm{Prob}\{S(t) < q_\alpha\} = 1 - \alpha. \tag{11}$$

In the simulation procedure we set $\alpha = 99\%$. The expected $(1 - \alpha)$ worst
case is given by the following:

$$E[\text{worst cases } (1 - \alpha)] = (1 - \alpha)^{-1} \int_\alpha^1 q_p \, dp, \tag{12}$$

$q_p$ being the $p$-quantile of the surplus distribution. The last equation is then the
average of the surplus values lower than the $\alpha$-quantile, $q_\alpha$.
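
In a simulation setting, (11) and (12) are typically replaced by their empirical counterparts. The following minimal sketch, with the hypothetical name `worst_case_summary`, takes a sample of simulated surplus values, isolates the worst (1 - alpha) fraction and returns the empirical quantile together with the average of the tail; it is one common estimator and not necessarily the one used by the authors.

```python
def worst_case_summary(surplus_samples, alpha=0.99):
    """Return (empirical quantile, average of the worst (1 - alpha) share of outcomes)."""
    ordered = sorted(surplus_samples)  # lowest surplus values first
    # Number of scenarios in the tail; round() guards against floating-point noise.
    n_tail = max(1, int(round((1.0 - alpha) * len(ordered))))
    tail = ordered[:n_tail]
    quantile = tail[-1]          # largest value among the worst cases
    cvar = sum(tail) / n_tail    # expected worst occurrence
    return quantile, cvar

# Toy example with 1000 artificial outcomes (not simulated surplus values).
quantile, cvar = worst_case_summary(range(1, 1001), alpha=0.99)
print(quantile, cvar)
```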


4 Results 


Recalling Section 2, the simulation results provide us with the expected value of 
the surplus for each period, the first value at time 0 being the portfolio difference 
between the pure premium and the market premium. Therefore, scrolling down Table 1,
we can very easily see the evolution of the surplus over time together with the
corresponding CVaR.

Table 1. Surplus behaviour and related parameters


Time   $E(S_t)$    CVaR        DS (99%)   RoS (99%)
0      1649.899
1      1732.851    179.9519    10.91%     -89.09%
2      1820.197    362.4452    20.92%     -79.08%
3      1912.03     556.0326    30.55%     -69.45%
4      2008.6      727.887     38.07%     -61.93%
5      2110.173    890.2828    44.32%     -55.68%
6      2217.03     1051.625    49.84%     -50.16%
7      2329.47     1211.922    54.66%     -45.34%
8      2447.811    1333.802    57.26%     -42.74%
9      2572.389    1409.132    57.57%     -42.43%
10     2703.561    1470.693    57.17%     -42.83%
11     2841.708    1478.472    54.69%     -45.31%
12     2987.235    1416.793    49.86%     -50.14%
13     3140.569    1332.953    44.62%     -55.38%
14     3302.167    1228.486    39.12%     -60.88%
15     3472.516    1109.346    33.59%     -66.41%
16     3652.13     982.2259    28.29%     -71.71%
17     3841.561    840.4843    23.01%     -76.99%
18     4041.392    660.9939    17.21%     -82.79%
19     4252.246    443.7866    10.98%     -89.02%


As far as the time evolution is concerned, the surplus shows an increasing trend,
which is consistent with the positive effect of the financial leverage, since we invest
for the whole period of time at a rate which is systematically higher ($\mu = 4.10\%$
and $r_0 = 3.78\%$) than the premium rate (2%), thus giving rise to a return on assets
always higher than the average rate of financing. As far as the CVaR is concerned, it
shows a dynamic which is totally consistent with the mathematical provision time
evolution, as one can expect: as has already been shown elsewhere [3], the financial
risk dynamic is mainly driven by the mathematical provision time progression.
Accordingly, the time evolution of the RoS, as defined in Section 2, is directly
influenced by the mathematical provision and its absolute value, as can easily be seen,
is dependent on the confidence level chosen. As far as the connection with the reserve
dynamic is concerned, we can state that both DS and worst expected RoS prove to be
fully consistent with the traditional and pragmatic idea that the lower the reserve the
higher the risk of the business. Therefore, distributable earnings can be quantified
and managed, through this approach, in order to minimise the ruin probability on the
basis of both the general investment strategy and the specific market condition
available at time of issue.


5 Conclusions and future research prospects


The sketched model proves to be a way to quantify the amount of distributable earnings
year-by-year with reference to a specific portfolio of policies, as it gives the
opportunity to build up a complete distribution budget. Since we concentrate solely on
the financial dynamics, the first extension could be the inclusion of a demographic
component and the modelling of the surplus dynamic by means also of stochastic
demographic rates, in order to incorporate, where appropriate, the systematic and
unsystematic components. Another extension could be the evaluation at a whole series
of critical confidence intervals in order to end up with a double-entry DS table
where the definition of its levels can be graduated by means of different levels of
probability, in order to control how the actual distributed amount can influence the
future performance of the portfolio.


Checking financial markets via Benford’s law: the
S&P 500 case 


Marco Corazza, Andrea Ellero and Alberto Zorzi 


Abstract. In general, in a given financial market, the probability distribution of the first signif- 
icant digit of the prices/returns of the assets listed therein follows Benford’s law, but does not 
necessarily follow this distribution in case of anomalous events. In this paper we investigate the 
empirical probability distribution of the first significant digit of S&P 500’s stock quotations. 
The analysis proceeds along three steps. First, we consider the overall probability distribution 
during the investigation period, obtaining as a result that it essentially follows
Benford’s law, i.e., that the market has ordinarily worked. Second, we study the
day-by-day probability distributions. We observe that the majority of such
distributions follow Benford’s law and that the non-Benford days are generally
associated with events such as the Wall Street crash on February
27, 2007. Finally, we take into account the sequences of consecutive non-Benford days, and 
find that, generally, they are rather short. 


Key words: Benford’s law, S&P 500 stock market, overall analysis, day-by-day analysis, 
consecutive rejection days analysis 


1 Introduction 


It is an established fact that some events, not necessarily of an economic nature, 
have a strong influence on the financial markets in the sense that such events can 
induce anomalous behaviours in the quotations of the majority of the listed assets. 
For instance, this is the case of the Twin Towers attack on September 11, 2001. 
Of course, not all such events are so (tragically) evident. In fact, financial markets
have several times been crossed by a mass of anomalous movements which are
individually not perceptible and whose causes are generally unobservable.

In this paper we investigate this phenomenon of “anomalous movements in financial
markets” in a real stock market, namely the S&P 500, by using the so-called
Benford’s law. In short (see the next section for more details), Benford’s law is the
probability distribution associated with the first significant digit¹ of numbers
belonging to a certain typology of sets. As will be made clear in section 2, it is
reasonable to guess that the first significant digit of financial prices/returns
follows Benford’s law in the case of ordinary working of the considered financial
markets, and that it does not follow such a distribution in anomalous situations.

¹ Here significant digit means a digit other than zero.

The remainder of this paper is organised as follows. In the next section we provide 
a brief introduction to Benford’s law and the intuitions underlying our approach. In 
section 3 we present a short review of its main financial applications. In section 
4 we detail our methodology of investigation and give the results coming from its 
application to the S&P 500 stock market. In the last section we provide some final
remarks and some suggestions for future research.


2 Benford’s law: an introduction 


Originally, Benford’s law was detected as empirical evidence. In fact, some scientists
noticed that, for extensive collections of heterogeneous numerical data expressed in
decimal form, the frequency of numbers which have d as the first significant digit,
with d = 1, 2, ..., 9, was not 1/9 as one would expect, but strictly decreased as d
increased; it was about 0.301 if d = 1, about 0.176 if d = 2, ..., about 0.051 if d = 8
and about 0.046 if d = 9. As a consequence, the frequency of numerical data with the
first significant digit equal to 1, 2 or 3 appeared to be about 60%. The first observation
of this phenomenon traces back to Newcomb in 1881 (see [9]), but a more precise
description of it was given by Benford in 1938 (see [2]). After the investigation of
a huge quantity of heterogeneous numerical data,² Benford conjectured the following
general formula for the probability that the first significant digit equals d:


$$\Pr(\text{first significant digit} = d) = \log_{10}\left(1 + \frac{1}{d}\right), \quad d = 1, \ldots, 9.$$


This formula is now called Benford’s law. 
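
As a quick numerical check of the formula, the following minimal Python sketch tabulates the nine probabilities and the combined share of the digits 1, 2 and 3. The function name `benford_first_digit_probs` is ours and purely illustrative.

```python
import math

def benford_first_digit_probs():
    """Return {d: log10(1 + 1/d)} for the first significant digits d = 1, ..., 9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

probs = benford_first_digit_probs()
for d, p in probs.items():
    print(d, f"{p:.3f}")

# Share of numbers whose first significant digit is 1, 2 or 3 (about 60%).
print(f"{probs[1] + probs[2] + probs[3]:.3f}")
```

The probabilities sum to one because the terms telescope to log10(10).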

Only in more recent times has Benford’s law obtained well-posed theoretical
foundations. Probably the two most common explanations for the emergence of probability
distributions which follow Benford’s law are linked to scale invariance and
multiplicative processes (see [11] and [6]).³ With regard to the latter explanation,
which is of interest for our approach, and without going into technical details, Hill
proved, under fairly general conditions and using random probability measures, that “if
[probability] distributions are selected at random and random samples are taken
from each of these distributions, the significant digits of the combined sample will
converge to Benford distribution” (see [6]). This statement offers the basis for the
main intuition underlying our paper. In fact, we consider the stocks of the S&P 500
market as the randomly selected probability distributions, and the prices/returns of
each of these different assets as the generated random samples. The first significant
digit of such prices/returns should follow the Benford distribution. But, if some
exceptional event affects these stocks in similar ways, the corresponding
asset prices/returns could be considered “less random” than assumed in [6], and the
probability distribution of their first significant digit should depart from the Benford
one. In this sense, whether or not the data fit Benford’s law provides an indication of
whether or not the corresponding financial market is working ordinarily.

2 For instance, lake surface areas, river lengths, molecular weights of compounds, street address
numbers and so on.

3 In other studies it has been proved that powers of a [0, 1]-uniform probability distribution
also asymptotically satisfy Benford’s law (see [1] and [7]).


3 Benford’s law: financial applications 


Investigations similar to ours were sketched in a short paper by Ley (see [8]),
who studied daily returns of the Dow Jones Industrial Average (DJIA) Index from
1900 to 1993 and of the Standard and Poor’s (S&P) Index from 1926 to 1993. The
author found that the distribution of the first significant digit of the returns roughly
follows Benford’s law. Similar results have been obtained for stock prices on single
trading days by Zhipeng et al. (see [12]).

An idea analogous to the one outlined in the previous section, namely that the
detection of a deviation from Benford’s law might be a symptom of data manipulation,
has been used in tax-fraud detection by Nigrini (see [10]), and in the detection of fraudulent
economic and scientific data by Gunnel et al. and by Diekmann, respectively (see [5] and [4]).

Benford’s law has been used also to discuss tests concerning the presence of 
“psychological barriers” and of “resistance levels” in stock markets. In particular 
De Ceuster et al. (see [3]) claimed that differences of the distribution of digits from 
uniformity are a natural phenomenon; as a consequence they found no support for the 
psychological barriers hypothesis. 

All these different financial applications support the idea that in financial markets 
that are not “altered”, Benford’s law holds. 


4 Do the S&P 500’s stocks satisfy Benford’s law? 


The data set we consider consists of 3067 daily close prices and 3067 daily close
logarithmic returns for 361 stocks belonging to the S&P 500 market,⁴ from August
14, 1995 to October 17, 2007. The analysis we perform proceeds in three steps:


— in the first one we investigate the overall probability distribution of the first sig- 
nificant digit both on the whole data set of prices and on the whole data set of 
returns; 

— in the second step we study the day-by-day distribution of the first significant digit
of returns;

— finally, in the third step we analyse the sequences of consecutive days in which 
the distribution of the first significant digit of returns does not follow Benford’s 
law, i.e., the consecutive days in which anomalous behaviours happen. 


4 In this analysis we take into account only the S&P 500 stocks that are listed for each of the 
days belonging to the investigation period. 
4.1 Overall analysis


Here we compare the overall probability distributions of the first significant digit of 
the considered prices and returns against Benford’s law and the uniform probability 
distribution (see Fig. 1) by means of the chi-square goodness-of-fit test. Uniform 
probability distribution is used as the (intuitive) benchmark alternative to the (coun- 
terintuitive) Benford’s law. 

On visual inspection, both empirical probability distributions seem to be
rather Benford-like (in particular, the one associated with returns). Nevertheless, in
both comparisons the null is rejected. In Table 1 we report the values of the
associated chi-square goodness-of-fit statistics with 8 degrees of freedom (we recall that
$\chi^2_{8,0.95} = 15.51$).
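
The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch under our own assumptions: the helper names are hypothetical, the first significant digit is read from the scientific-notation representation of each number, and the sample of powers of 2 is a stand-in for real prices or returns, which are not reproduced here.

```python
import math
from collections import Counter

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
UNIFORM = {d: 1 / 9 for d in range(1, 10)}
CHI2_CRIT_8DF_5PCT = 15.51  # critical value recalled in the text

def first_significant_digit(x):
    """First non-zero digit of a non-zero number, read from its scientific notation."""
    return int(f"{abs(x):.15e}"[0])

def chi_square_stat(values, reference):
    """Pearson chi-square statistic of first-digit counts against reference probabilities."""
    digits = [first_significant_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * reference[d]) ** 2 / (n * reference[d])
        for d in range(1, 10)
    )

# Stand-in data: powers of 2, a classic Benford-like sequence.
sample = [2 ** k for k in range(1, 201)]
stat_benford = chi_square_stat(sample, BENFORD)
stat_uniform = chi_square_stat(sample, UNIFORM)
print(f"vs Benford: {stat_benford:.2f}, vs uniform: {stat_uniform:.2f}")
print("reject Benford at 5%:", stat_benford > CHI2_CRIT_8DF_5PCT)
```

With nine digit classes the statistic has 8 degrees of freedom, which is why it is compared with 15.51 at the 5% level.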

From a qualitative point of view, our results are analogous to the ones obtained
by Ley (see [8]). In particular, that author observed that, although the
chi-square goodness-of-fit tests on the DJIA and S&P Indexes suggest rejection of the
null, this was due to the large number of observations considered. In fact, the same
kind of analysis performed only on 1983-1993 data suggested acceptance of the
null. Moreover, the rejection with respect to the uniform probability distribution is
much stronger than the rejection with respect to Benford’s law. In other words,
looking at chi-square as a distance, the empirical probability distributions are closer to
Benford’s law than to the uniform probability distribution. In this sense we agree with
Ley (see [8]) in claiming that the distributions of the first significant digit of prices and
returns essentially follow Benford’s law. In other terms, the behaviour of the S&P 500 stock
market as a whole in the period August 14, 1995 to October 17, 2007 can be
considered “ordinary”.

Fig. 1. Overall empirical probability distributions


Table 1. Overall calculated chi-square

Reference probability distribution    $\chi^2$ w.r.t. prices    $\chi^2$ w.r.t. returns

Benford                               151527.74                 7664.84
Uniform                               780562.24                 673479.62

Finally, we observe that the empirical probability distribution related to returns is
significantly closer to Benford’s law than the empirical probability distribution related
to prices. In particular, in terms of the chi-square statistic, the latter is 19.77 times
further away from Benford’s law than the former (151527.74/7664.84 ≈ 19.77). This evidence
is theoretically consistent with what is stated in the paper of
Pietronero et al. (see [11]), since logarithmic returns are obtained from prices by a
multiplicative process.


4.2 Day-by-day analysis 


Here, we focus on returns, since their empirical probability distribution
is closer to Benford’s law than that of prices. We repeat, day by day, the same kind
of analysis considered in the previous subsection, but only with respect to Benford’s
law.

Over the investigated 3067 days, the null is rejected 1371 times, i.e., in about
44.70% of cases. In Figure 2 we represent the values of the day-by-day calculated
chi-square goodness-of-fit tests (the horizontal white line indicates the critical value
$\chi^2_{8,0.95}$).

Fig. 2. Day-by-day calculated chi-square (day-by-day analysis with respect to returns: August 14, 1995 to October 17, 2007)

We deepen our analysis by also considering a significance level α equal to 1% (we
recall that $\chi^2_{8,0.99} = 20.09$); the null is then rejected 890 times, i.e., in about
29.02% of cases.

In order to check if such rejection percentages are reasonable, we perform the 
following computational experiment: 


— First, for each of the considered 361 stocks we generate a simulated time series of
its logarithmic returns which has the same length as the original time series, and
whose probability distribution is Gaussian with mean and variance
equal to the real ones estimated for the stock⁵ (the Gaussian probability distribution
is chosen for coherence with the classical theory of financial markets);

— Second, we perform a day-by-day analysis on the generated financial market in 
the same way as for the true financial market. 


Repeating the experiment 50 times, we obtain the following mean values of the
rejection percentages: 57.92% if α = 5% (about 1776 cases) and 33.50% if α = 1%
(about 1027 cases). These results should not be considered particularly surprising. In
fact, we always associate the same kind of probability distribution, the Gaussian one,
with each of the considered stocks, instead of selecting it at random as would be required
to obtain a Benford distribution (see section 2).
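
The computational experiment can be sketched as follows. This is a minimal illustration, not the code used for the paper: the per-stock means and standard deviations are hypothetical placeholders drawn at random (the paper estimates them from the real series), only 200 days are simulated to keep the run short, and cross-stock correlation is ignored, as in footnote 5.

```python
import math
import random
from collections import Counter

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """First non-zero digit of a non-zero number."""
    return int(f"{abs(x):.15e}"[0])

def chi_square_vs_benford(values):
    """Pearson chi-square statistic of first-digit counts against Benford's law."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum((counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items())

def simulated_rejection_rate(means, stdevs, n_days, critical_value, rng):
    """Share of simulated days on which the cross-section of Gaussian returns rejects Benford."""
    rejected = 0
    for _ in range(n_days):
        day_returns = [rng.gauss(m, s) for m, s in zip(means, stdevs)]
        if chi_square_vs_benford(day_returns) > critical_value:
            rejected += 1
    return rejected / n_days

rng = random.Random(0)
# Hypothetical daily moments for 361 stocks (placeholders, not estimates from real data).
means = [rng.uniform(0.0, 0.001) for _ in range(361)]
stdevs = [rng.uniform(0.01, 0.04) for _ in range(361)]
rate = simulated_rejection_rate(means, stdevs, n_days=200, critical_value=15.51, rng=rng)
print(f"rejection rate at the 5% level: {rate:.2%}")
```

Averaging `rate` over repeated runs with different seeds would mirror the 50 repetitions described above.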

The fact that the rejection percentages in a classical-like market are greater than the
corresponding percentages in the true one indicates that a certain number of deviations
from Benford’s law, i.e., a certain number of days in which the financial market is
not working ordinarily, is physiological. Moreover, the significant differences between
the rejection percentages of the classical-like market and those of the true one can be
interpreted as a symptom of the fact that, at least from a distributional point of view,
the true financial market does not always follow what is prescribed by the classical
theory.

In Table 2 we report the 45 most rejected days at a 5% significance level with
the corresponding values of the chi-square goodness-of-fit tests with 8 degrees of
freedom.

We notice that some of the days and periods reported in Table 2 are characterised
by well-known events, for instance the Wall Street crash of February 27, 2007 (the most
rejected day) and the troubles of important hedge funds since 2003 (24.44% of the
45 most rejected days fall in 2003). Nevertheless, in other rejection days/periods a
link with analogous events cannot generally be observed. In such cases the day-by-day
analysis can be profitably used to detect hidden anomalous behaviours in financial
markets. On the other hand, the most accepted day is September 5, 1995, whose value
of the chi-square goodness-of-fit test is 0.91. In Figure 3 we graphically compare the
empirical probability distributions of the most rejected and of the most accepted days
against Benford’s law and the uniform probability distribution.


5 In generating this simulated financial market, we do not consider the correlation structure 
existing among the returns of the various stocks because, during the investigation period, 
such a structure does not appear particularly relevant. So, the simulated financial market 
can be reckoned as a reasonable approximation of the true one. 
Table 2. Most rejected days and related calculated chi-square

Rank    Day           $\chi^2$
1       02.27.2007
2       07.29.2002
3       01.02.2003
4       03.24.2003
5       01.24.2003
6       10.27.1997
7       03.13.2007
8       08.29.2007
9       07.24.2002
10      03.08.1996
11      08.06.2002
12      08.03.2007
13      08.28.2007
14      05.10.2007
15      09.03.2002
16      10.01.2002    100.19
17      08.04.1998    99.96
18      03.17.2003    98.81
19      06.07.2007    96.93
20      06.17.2002    93.67
21      12.27.2002    92.29
22      08.05.2002    90.05
23      03.30.2005    86.39
24      04.14.2003    85.39
25      02.24.2003    84.14
26      08.08.2002    84.05
27      03.02.2007    82.86
28      05.25.2004    81.47
29      06.13.2007    81.47
30      09.07.2007    81.29
31      06.22.2007    80.86
32      09.26.2002    78.90
33      03.10.2003    78.44
34      10.01.2003    77.21
35      03.21.2003    75.63
36      04.14.2000    75.22
37      06.05.2006    74.57
38      07.20.2007    74.27
39      08.22.2003    74.10
40      02.22.2005    73.40
41      08.05.2004    72.96
42      05.30.2003    72.24
43      07.10.2007    71.97
44      04.11.1997    71.44
45      10.28.2005    71.23

Fig. 3. Empirical probability distributions of the most rejected and most accepted days


Finally, we spend a few words on the day of the Twin Towers attack, which
has been chosen as the central day of the data set. We remark that 42.89% of all
the rejected days fall before this day and 57.11% after it. Moreover, if we
limit our attention to the 45 most rejected days, the difference between these
percentages increases considerably, to 11.11% before and 88.89% after.
These results show that the S&P 500 stock market was subject to anomalous activities
more after September 11, 2001 than before.


4.3 Consecutive rejection days analysis 


Here, focusing once more on returns, we analyse the sequences of
consecutive rejection days detected using α = 5%. In Table 3 we report the number
of such sequences having lengths from 1 day to 12 days (12 days is the
maximum length detected in the investigation period). To deepen the investigation,
we also report the results obtained using α = 1%.

We observe that the large majority of the sequences of consecutive
rejection days are rather short. This fact can be interpreted as the capability of the S&P
500 stock market to “absorb” anomalous events in short time periods.
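
The counting of sequences can be sketched as below. The function name `rejection_run_lengths` and the short list of daily flags are hypothetical; in the paper the flags would be the 3067 day-by-day test outcomes.

```python
from collections import Counter
from itertools import groupby

def rejection_run_lengths(rejected):
    """Count maximal runs of consecutive rejection days, keyed by run length."""
    return Counter(
        sum(1 for _ in group)
        for is_rejected, group in groupby(rejected)
        if is_rejected
    )

# Hypothetical daily outcomes: True means Benford's law is rejected on that day.
flags = [True, False, True, True, False, False, True, True, True, False, True]
runs = rejection_run_lengths(flags)
for length in sorted(runs):
    print(length, runs[length])
```

Each rejected day belongs to exactly one run, so the run lengths weighted by their counts add up to the total number of rejected days.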

Conversely, given such a capability, the presence of long sequences of consecutive
rejection days is an indicator of malaise of the market. For instance, this is
the case of a 9-day sequence (September 20, 2001 to October 2, 2001) that started
immediately after the Twin Towers attack and of a 6-day sequence (February 27, 2007 to
March 6, 2007) that started on the day of the Wall Street crash. Moreover, analogously
to what we observed in the previous subsection, since the events/causes
associated with such sequences are not always observable, the consecutive rejection days
analysis might be profitably used for detecting continued anomalous behaviours in
financial markets.


Table 3. Sequences of consecutive rejection days

Sequence length    # with α = 5%    # with α = 1%
1                  412              366
2                  148              105
3                  71               52
4                  48               12
5                  24               7
6                  13               2
7                  7                1
8                  1                0
9                  3                0
10                 1                1
11                 0                0
12                 1                0
5 Conclusions


To the best of our knowledge, several aspects concerning the use of Benford’s law-based
analyses in financial markets have not yet been investigated. Among them,
we consider the following:


Given the few studies on this topic, the actual capability of this kind of approach
to detect anomalous behaviours in financial markets has to be carefully checked
and measured. To this end, the systematic application of these approaches to a
large number of different financial markets is needed;

From a methodological point of view, we conjecture that restricting the analysis
performed in this paper to the different sectors composing the financial market
could be useful for detecting, in the case of anomalous behaviours of the market
as a whole, which sectors are the most reliable;

We also conjecture that, in order to make this analysis more careful, we should at
least take into account the probability distribution of the second significant digit
(see [6]), i.e.,

$$\Pr(\text{second significant digit} = d) = \sum_{k=1}^{9} \log_{10}\left(1 + \frac{1}{10k + d}\right), \quad d = 0, \ldots, 9;$$


Finally, the results we presented in this paper are ex post. Currently, we are be- 
ginning to develop and apply a new Benford’s law-based approach in order to 
check some predictive capabilities. The first very preliminary results seem to be 
encouraging. 
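
As a small complement to the second-digit formula mentioned in the conclusions, the following Python sketch evaluates it for d = 0, ..., 9. The function name is ours and purely illustrative.

```python
import math

def benford_second_digit_probs():
    """Return {d: sum over k = 1..9 of log10(1 + 1/(10k + d))} for d = 0, ..., 9."""
    return {
        d: sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
        for d in range(10)
    }

second = benford_second_digit_probs()
for d, p in second.items():
    print(d, f"{p:.4f}")
```

The second-digit distribution is much flatter than the first-digit one, ranging from about 0.12 for d = 0 to about 0.085 for d = 9.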
Empirical likelihood based nonparametric testing for CAPM


Pietro Coretto and Maria Lucia Parrella 


Abstract. The Capital Asset Pricing Model (CAPM) predicts a linear relation between assets’ 
return and their betas. However, there is empirical evidence that such a relationship does not 
necessarily occur, and in some cases it might even be nonlinear. In this paper we explore 
a nonparametric approach where the linear specification is tested against a nonparametric 
alternative. This methodology is implemented on S&P500 data. 


Key words: CAPM, goodness-of-fit test, empirical likelihood 


1 Introduction 


An asset pricing model provides a method for assessing the riskiness of cash flows
from a project. The model provides an estimate of the relationship between that riskiness
and the cost of capital. According to the “capital asset pricing model” (CAPM),
the only relevant measure of a project’s risk is a variable unique to this model, known
as the project’s beta. In the CAPM, the cost of capital, i.e., the return, is a linear function
of the beta of the project being evaluated. A manager who has an estimate
of the beta of a potential project can use the CAPM to estimate the cost of capital for 
the project. If the CAPM captures investors’ behaviour adequately, then the historical 
data should reveal a positive linear relation between return on financial assets and 
their betas. Also, no other measure of risk should be able to explain the differences in 
average returns across financial assets that are not explained by CAPM betas. The fact 
that CAPM theory predicts the existence of a cross-section linear relation between 
returns and betas can be empirically tested. To this end we propose a nonparametric 
testing methodology (see [10] and [3] among others). 

The first test of the CAPM was run by Fama and MacBeth [7] and their study 
validated the theory. The authors tested the linearity against some parametric nonlinear 
alternatives. However, subsequent empirical analyses highlighted that the validity of
the CAPM could depend on the testing period. There is a huge amount of literature
on this topic (for a comprehensive review see [11]), but no final conclusions have
been reached.


M. Corazza et al. (eds.), Mathematical and Statistical Methods for Actuarial Sciences and Finance 
© Springer-Verlag Italia 2010 




The famous Fama-MacBeth contribution (and those that followed) tests the linear
specification against a number of nonlinear parametric specifications. The main
contribution of this paper is that we test the linear specification of the CAPM against
a nonlinear nonparametric specification, so that we do not confine the test to
a specific (restrictive) nonlinear alternative. Our testing method uses kernel
smoothing to form a nonparametric specification, against which we test the null
hypothesis that the relation between returns and betas is linear; the alternative
hypothesis is that there is a deviation from the linearity predicted by the CAPM.
We apply our methodology to the S&P 500 market.

The paper is organised as follows: we introduce the theoretical model, we intro- 
duce the Fama and MacBeth two-stage parametric estimation procedure, we outline 
the nonparametric testing methodology and finally we discuss some empirical findings 
based on the analysis of the S&P 500 market. 


2 The CAPM in a nutshell 


CAPM was first developed by Sharpe and Treynor; Mossin, Lintner and Black brought 
the analysis further. For a comprehensive review see [5] and [1]. We will refer to SLB 
as the Sharpe-Lintner-Black version of the model. The SLB model is based on the 
assumption that there is a positive trade-off between any asset’s risk and its expected 
return. In this model, the expected return on an asset is determined by three variables: 
the risk-free rate of return, the expected return on the market portfolio and the asset’s 
beta. The last one is a parameter that measures each asset’s systematic risk, i.e., the 
share of the market portfolio’s variance determined by each asset. 


2.1 Theoretical model 


The CAPM equation is derived by imposing a number of assumptions that we discuss 
briefly. An important building-block of the CAPM theory is the so-called perfect 
market hypothesis. This is about assuming away any kind of frictions in trading and 
holding assets. Under the perfect market hypothesis, unlimited short-sales of risky 
assets and risk-free assets are possible. 

The second assumption is that all investors choose their portfolios based on mean
(which they like) and variance (which they do not like). This assumption means that
people’s choices are consistent with the von Neumann-Morgenstern axiomatisation.

All investors make the same assessment of the return distribution. This is referred
to as “homogeneous expectations”. The implication of this hypothesis is that we can
draw the same minimum-variance frontier for every investor.

Next is the “market equilibrium” hypothesis (i.e., supply of assets equals demand).
The market portfolio is defined as the portfolio of assets that are in positive net supply,
weighted by their market capitalisations. Usually it is assumed that the risk-free
instrument is in zero net supply. On the demand side, the net holdings of all investors
equal aggregate net demand. The last assumption states that all assets are marketable,
i.e., there is a market for each asset.
On the basis of these assumptions we can derive a model that relates the expected
return of a risky asset to the risk-free rate and the return of the market portfolio; in
the latter, all assets are held according to their value weights. We denote by $R_j$ the
random variable that describes the return of risky asset $j$. Let $R_f$ be the risk-free rate
and $R_M$ the return on the market portfolio. Under the assumptions above, and assuming
that the following expectations exist, the theory of CAPM states that the
following relation holds:

$$E[R_j] = E[R_f] + \beta_j \left(E[R_M] - E[R_f]\right). \qquad (1)$$

The term $\beta_j$ in the CAPM equation (1) is the key to the whole model’s implications.
$\beta_j$ represents the risk that asset $j$ contributes to the market portfolio, measured
relative to the market portfolio’s variance:

$$\beta_j = \frac{\mathrm{Cov}[R_j, R_M]}{\mathrm{Var}[R_M]}. \qquad (2)$$

$\beta$ is a measure of systematic risk: since it is correlated with the market portfolio’s
variance and the market portfolio is efficient, an investor cannot possibly diversify
away from it. The theory predicts that each asset’s return depends linearly on its beta.
Notice that the CAPM equation is a one-period model; this means that this equation
should hold period by period. In order to estimate and test the CAPM equation date
by date, we need to make further assumptions in order to estimate the betas first.
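
To fix ideas, the sample analogue of equation (2) and the prediction of equation (1) can be computed as in the sketch below. The function names and the toy return series are hypothetical and serve only to illustrate the two formulas.

```python
from statistics import mean

def beta(asset_returns, market_returns):
    """Sample analogue of equation (2): Cov[R_j, R_M] / Var[R_M]."""
    m_asset, m_market = mean(asset_returns), mean(market_returns)
    cov = sum((a - m_asset) * (m - m_market) for a, m in zip(asset_returns, market_returns))
    var = sum((m - m_market) ** 2 for m in market_returns)
    return cov / var  # the common 1/(n - 1) factors cancel

def capm_expected_return(risk_free, market_expected, beta_j):
    """Equation (1): E[R_j] = E[R_f] + beta_j * (E[R_M] - E[R_f])."""
    return risk_free + beta_j * (market_expected - risk_free)

# Toy data: an asset that moves exactly twice as much as the market.
market = [0.01, -0.02, 0.015, 0.005]
asset = [2 * m + 0.001 for m in market]
b = beta(asset, market)
print(round(b, 6))
print(round(capm_expected_return(0.02, 0.08, 1.5), 6))
```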


2.2 Testing strategy 


The beauty of the CAPM theory is that in order to predict assets’ return we only 
need information about prices and no further expensive information is needed. The 
tests conducted over the last 45 years have brought up different issues and contrasting 
views and results. Whereas the first test found no empirical evidence for the theory of 
equilibrium asset prices, a very famous test, conducted in 1973 by Fama and MacBeth 
(see [7]), provided evidence in favour of the validity of the SLB-CAPM model. However,
later studies (e.g., Fama and French [6]) have challenged the positive and linear
relationship between betas and returns (i.e., the main conclusion of the CAPM theory) by
introducing other variables which proved to have a much greater explanatory power,
but at some cost.

The main contribution of this paper is a nonparametric test of the linear
specification of the CAPM. Our nonparametric testing approach is based on the comparison
between the predicted returns obtained via the parametric linear model implied by the
CAPM and the returns predicted by a kernel estimator. This testing strategy implies
two steps: the first step (or “parametric step”) is to estimate the predicted returns
based on the CAPM equation; the second step (or “nonparametric step”) is to predict
the returns on the basis of a kernel regression. We describe the two steps in detail.
2.3 The parametric step


We complete the first step by using the same methodology developed by Fama and 
MacBeth [7]. In order to apply this methodology we need to make some further as- 
sumptions. The CAPM is a model of expected returns in a one-period economy. What 
we actually observe, though, is a time series of asset prices and other variables from 
which we can compute the realised returns over various holding periods. We need 
to assume that investors know the return distribution over one particular investment 
period. In order to estimate the parameters of that distribution it is convenient to 
assume that the latter is stationary. In addition, we assume that returns are drawn
independently over time. Although the last assumption appears to be too strong,
several empirical studies have shown that it does not seriously affect the first-step
estimation (see [7]). This applies in particular when short sequences of daily
returns are used to estimate the betas (see below).

What about the “market portfolio”? Can the market portfolio be easily identified? 
It is worth remembering that the CAPM covers all marketable assets and it does not 
distinguish between different types of financial instruments. This is the focus of Roll’s 
Critique [14]. As a market proxy we will use the S&P500 index. The CAPM provides 
us with no information about the length of the time period over which investors choose 
their portfolios. It could be a day, a month, a year or a decade. 

Now we describe the Fama-MacBeth estimation methodology. We have a time
series of assets’ prices recorded in some financial market. Let us assume that $R_{j,t}$ is
the log-return at time $t$ for asset $j$, where $j = 1, 2, \ldots, S$ and $t = 1, 2, \ldots, T$.
Let $R_{M,t}$ be the market log-return at time $t$. The relation (1) has to hold at each $t$ for
each asset, so we have to estimate the CAPM for each $t$. To do so we need a time
series of $\beta$s.

The first stage is to obtain a time series of estimated betas based on a rolling
scheme. For each asset $j = 1, 2, \ldots, S$, and for fixed $w$ and $p = 1, 2, \ldots, T - w + 1$,
we take the pairs $\{R_{j,t}, R_{M,t}\}_{t = p, p+1, \ldots, p+w-1}$ and we estimate the market equation

$$R_{j,t} = \alpha_{j,p} + \beta_{j,p} R_{M,t} + \epsilon_{j,t}, \qquad (3)$$

where $\{\epsilon_{j,t}\}_{t = p, p+1, \ldots, p+w-1}$ is an i.i.d. sequence of random variables with zero mean
and finite variance. Equation (3) is estimated for each $j = 1, 2, \ldots, S$, to obtain $\hat\beta_{j,p}$ for
periods $p = 1, 2, \ldots, T - w + 1$. The estimated $\hat\beta_{j,p}$ is the estimate of the systematic
risk of the $j$th asset in period $p$. From this first regression we also store the estimated
standard deviation of the error term, say $\hat\sigma_{j,p} = \sqrt{\widehat{\mathrm{Var}}(\epsilon_{j,t})}$. The latter is a measure
of the unsystematic risk connected to the $j$th asset in period $p$. The use of $\hat\sigma_{j,p}$ will
become clear below.
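
The rolling first stage can be sketched as follows for a single asset. This is a minimal illustration under our own assumptions: the function names are hypothetical, windows are indexed from 0 rather than from 1, and the residual variance is divided by w - 2, a degrees-of-freedom choice that the text does not specify.

```python
import math
from statistics import mean

def market_model_fit(asset, market):
    """OLS fit of equation (3) on one window: returns (alpha, beta, residual std)."""
    m_asset, m_market = mean(asset), mean(market)
    sxx = sum((m - m_market) ** 2 for m in market)
    sxy = sum((a - m_asset) * (m - m_market) for a, m in zip(asset, market))
    b = sxy / sxx
    a0 = m_asset - b * m_market
    residuals = [a - a0 - b * m for a, m in zip(asset, market)]
    # Assumed estimator of the error variance: residual sum of squares over (w - 2).
    sigma = math.sqrt(sum(e * e for e in residuals) / (len(asset) - 2))
    return a0, b, sigma

def rolling_first_stage(asset, market, w):
    """Fit the market equation on each of the T - w + 1 windows of length w."""
    return [
        market_model_fit(asset[p:p + w], market[p:p + w])
        for p in range(len(asset) - w + 1)
    ]

# Toy series: the asset follows the market equation exactly, with no noise.
market = [0.01, -0.02, 0.015, 0.005, -0.01, 0.02]
asset = [0.001 + 1.2 * m for m in market]
fits = rolling_first_stage(asset, market, w=4)
for a0, b, sigma in fits:
    print(round(a0, 6), round(b, 6))
```

Repeating the call for every asset j gives the betas and unsystematic-risk measures used as regressors in the second stage.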

In the second stage, for each period $p = 1, 2, \ldots, T - w + 1$ we estimate the
linear model implied by the CAPM by applying a cross-section (across $j = 1, 2, \ldots, S$)
linear regression of assets’ returns on their estimated betas. For each period
$p = 1, 2, \ldots, T - w + 1$ the second-stage estimation is:

$$R_{j,p+1} = \gamma_1 + \gamma_2 \hat\beta_{j,p} + \epsilon^{A}_{j,p+1}, \qquad (4)$$

where $\{\epsilon^{A}_{j,p+1}\}_{j = 1, 2, \ldots, S}$ is an i.i.d. sequence of random variables having zero mean
and finite variance. Notice that we regress $R_{j,p+1}$ on $\hat\beta_{j,p}$; this is because it is
assumed that investors base current investment decisions on the most recent available
$\beta$. The Fama-MacBeth testing procedure consists in testing the linear relation (4) for
each period $p$. If the linear model (4) holds in period $p$, then the model (1)
statistically holds in period $p$. For a long time it has been thought that failures of the
CAPM are due to the effect of unsystematic risk on returns and to possible
nonlinearities in betas. Two further second-stage equations have been considered to
check for these effects. The first alternative to (4) is:

$$R_{j,p+1} = \gamma_1 + \gamma_2 \hat\beta_{j,p} + \gamma_3 \hat\sigma_{j,p} + \epsilon^{B}_{j,p+1}, \qquad (5)$$

where we add a further regressor, the unsystematic risk measure. The second
alternative is:

$$R_{j,p+1} = \gamma_1 + \gamma_2 \hat\beta_{j,p} + \gamma_3 \hat\beta_{j,p}^2 + \epsilon^{C}_{j,p+1}, \qquad (6)$$

where the squared betas enter the regression. In (5) and (6) we assume that the
errors $\{\epsilon^{B}_{j,p+1}\}_{j = 1, 2, \ldots, S}$ and $\{\epsilon^{C}_{j,p+1}\}_{j = 1, 2, \ldots, S}$ are two i.i.d. sequences of random variables
with zero mean and finite variance. As for (4), (5) and (6) are also estimated for each
period $p = 1, 2, \ldots, T - w + 1$. The models A, B and C represented by equations
(4), (5) and (6) are estimated and tested in the famous paper by Fama and MacBeth [7].
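
For a single period p, the three second-stage regressions can be sketched as below. This is only an illustration under our own assumptions: the function names, the labels "A", "B" and "C" used as arguments, and the toy cross-section are hypothetical, and the coefficients are obtained by solving the normal equations directly rather than with a statistical library.

```python
def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) g = X'y."""
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [
        [sum(row[i] * row[j] for row in rows) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting, then Gauss-Jordan elimination of the column.
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * z for x, z in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def design(betas, sigmas, model):
    """Regressors of models A, B and C: constant, beta, and sigma or beta squared."""
    if model == "A":
        return [[1.0, b] for b in betas]
    if model == "B":
        return [[1.0, b, s] for b, s in zip(betas, sigmas)]
    if model == "C":
        return [[1.0, b, b * b] for b in betas]
    raise ValueError(f"unknown model: {model}")

# Toy cross-section in which returns are exactly linear in the betas.
betas = [0.5, 0.8, 1.0, 1.3, 1.6]
sigmas = [0.02, 0.01, 0.03, 0.015, 0.025]
returns_next = [0.001 + 0.004 * b for b in betas]
gamma_a = ols(design(betas, sigmas, "A"), returns_next)
gamma_b = ols(design(betas, sigmas, "B"), returns_next)
gamma_c = ols(design(betas, sigmas, "C"), returns_next)
print([round(g, 6) for g in gamma_a])
```

When the linear specification holds, as in this toy example, the third coefficient of models B and C is numerically zero.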


3 The nonparametric goodness-of-fit test 


In this section we apply the method proposed by Härdle et al. [8] for testing the linear
specification of the CAPM model, that is, models A, B and C defined in the second
stage of the previous step. This goodness-of-fit test is based on the combination of
two nonparametric tools: the Nadaraya-Watson kernel estimator and the Empirical
Likelihood of Owen [13]. Here we briefly describe the testing approach; then, in the
next section, we apply it to the CAPM model estimated on the S&P500 stock market.
Let us consider the following nonparametric model


$$R_{j,p+1} = m(X_{j,p}) + e_{j,p+1}, \qquad j = 1, \ldots, S, \qquad (7)$$

where $R_{j,p+1}$ is the log-return of period $p$ for the asset $j$, $X_{j,p} \in \mathbb{R}^d$ is the vector
of $d$ regressors observed in period $p$ for the asset $j$ and $e_{j,p+1}$ is the error, for which
we assume that $E(e_{j,p+1} \mid X_{j,p}) = 0$ for all $j$. We also assume that the regressors
$X_{j,p}$ and the errors $e_{j,p+1}$ are independent for different $j$s, but we allow some
conditional heteroscedasticity in the model. The main interest lies in testing the following
hypothesis

$$H_0 : m(x) = m_\gamma(x) = \gamma^T x \quad \text{versus} \quad H_1 : m(x) \neq m_\gamma(x), \qquad (8)$$

where $m_\gamma(x) = \gamma^T x$ is the linear parametric model and $\gamma$ is the vector of unknown
parameters belonging to a parameter space $\Gamma \subset \mathbb{R}^{d+1}$. Let us denote with $m_{\hat\gamma}(x)$ the
estimate of $m_\gamma(x)$ given by a parametric method consistent under $H_0$.
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The Nadaraya-Watson estimator of the regression function m(x) is given by 


s 
jai Ri,pt1 Kn (x — Xj,p) 


(9) 
SS Kn(x — Xj,p) 


mn (x) = 


where K;,(u) = h~¢K (h—'u) and K is ad-dimensional product kernel, as defined in 
[8]. The parameter h is the bandwidth of the estimator, which regulates the smoothing 
of the estimated function with respect to all regressors. We use acommon bandwidth 
because we assume that all the regressors have been standardised. 
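
As an illustration of estimator (9), the following Python sketch evaluates the Nadaraya-Watson estimate at a single point. The Gaussian product kernel and the function names are assumptions made for this example; the kernel actually used is the one defined in [8].

```python
import numpy as np

def gaussian_product_kernel(u):
    """Product of univariate standard Gaussian kernels; u has shape (S, d)."""
    return np.prod(np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi), axis=1)

def nadaraya_watson(x, X, R, h):
    """Estimate m(x) at a point x (shape (d,)) from regressors X (S, d) and returns R (S,)."""
    d = X.shape[1]
    # K_h(x - X_j) = h^{-d} K((x - X_j) / h), with a common bandwidth h (assumed Gaussian K)
    weights = gaussian_product_kernel((x - X) / h) / h**d
    return np.sum(weights * R) / np.sum(weights)
```

With two observations placed symmetrically around the evaluation point, the estimate is simply the average of the two responses.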

When applied to kernel estimators, empirical likelihood can be defined as follows.
For a given $x$, let $p_j(x)$ be nonnegative weights assigned to the pairs $(X_{j,p}, R_{j,p+1})$,
for $j = 1, \ldots, S$. The empirical likelihood for a smoothed version of $m_{\hat\gamma}(x)$ is defined
as

$$L\{\tilde m_{\hat\gamma}(x)\} = \max \prod_{j=1}^{S} p_j(x), \tag{10}$$

where the maximisation is subject to the following constraints

$$\sum_{j=1}^{S} p_j(x) = 1; \qquad \sum_{j=1}^{S} p_j(x)\, K\!\left(\frac{x - X_{j,p}}{h}\right) \left[R_{j,p+1} - \tilde m_{\hat\gamma}(x)\right] = 0. \tag{11}$$


As is clear from equation (11), the comparison is based on a smoothed version of the
estimated parametric function $m_{\hat\gamma}(x)$ (see [8] for a discussion), given by

$$\tilde m_{\hat\gamma}(x) = \frac{\sum_{j=1}^{S} m_{\hat\gamma}(X_{j,p}) K_h(x - X_{j,p})}{\sum_{j=1}^{S} K_h(x - X_{j,p})}. \tag{12}$$

By using Lagrange's method, the empirical log-likelihood ratio is given by

$$l\{\tilde m_{\hat\gamma}(x)\} = -2 \log\left[L\{\tilde m_{\hat\gamma}(x)\}\, S^S\right]. \tag{13}$$

Note that $S^S$ comes from the maximisation in (10): when only the first constraint in (11)
is imposed, the maximum is achieved at $p_j(x) = S^{-1}$.
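
The Lagrange step can be sketched numerically as follows. Writing $g_j = K((x - X_{j,p})/h)\,[R_{j,p+1} - \tilde m_{\hat\gamma}(x)]$, the standard empirical likelihood solution is $p_j = 1/\{S(1 + \lambda g_j)\}$, with $\lambda$ solving $\sum_j g_j/(1 + \lambda g_j) = 0$, so that the statistic in (13) equals $2\sum_j \log(1 + \lambda g_j)$. The function name and the bisection solver are assumptions of this sketch, not the implementation used in the paper.

```python
import numpy as np

def el_log_ratio(g, n_iter=200):
    """Return (l, p): the statistic -2 log(L * S^S) and the optimal weights p_j,
    under the constraints sum(p) = 1 and sum(p * g) = 0."""
    g = np.asarray(g, dtype=float)
    S = g.size
    # Zero must lie strictly inside the range of g, otherwise no valid weights exist.
    if g.min() >= 0.0 or g.max() <= 0.0:
        return np.inf, None
    # Admissible multipliers keep every 1 + lam * g_j positive.
    lo, hi = -1.0 / g.max(), -1.0 / g.min()
    for _ in range(n_iter):
        lam = 0.5 * (lo + hi)
        # The score sum(g / (1 + lam * g)) is strictly decreasing in lam.
        if np.sum(g / (1.0 + lam * g)) > 0.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    p = 1.0 / (S * (1.0 + lam * g))
    return 2.0 * np.sum(np.log1p(lam * g)), p
```

Bisection is used here only because it needs no derivative and cannot leave the admissible interval; a Newton iteration would be a common alternative.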


Theorem 1. Under $H_0$ and the assumptions A.1 in [8], we have

$$l\{\tilde m_{\hat\gamma}(x)\} \xrightarrow{d} \chi^2_1. \tag{14}$$

Proof (sketch). The proof of the theorem is based on the following asymptotic
equivalence (see [4] and [8])

$$l\{\tilde m_{\hat\gamma}(x)\} \simeq \left[(S h^d)^{1/2}\, \frac{\hat m_h(x) - \tilde m_{\hat\gamma}(x)}{V^{1/2}(x; h)}\right]^2, \tag{15}$$

where $V(x; h)$ is the conditional variance of $R_{j,p+1}$ given $X_{j,p} = x$. By theorem 3.4
of [2], the quantity in brackets is asymptotically $N(0, 1)$. □

As shown in Theorem 1, the empirical log-likelihood ratio is asymptotically equivalent
to a Studentised $L_2$-distance between $\tilde m_{\hat\gamma}(x)$ and $\hat m_h(x)$, so it may be compared to the
test statistics used in [9] and [12]. The main attraction of the test procedure described
here is its ability to studentise the statistic automatically, so we do not have to
estimate $V(x; h)$, contrary to what happens with other nonparametric goodness-of-fit
tests. Based on Theorem 1 and on the assumed independence of the regressors, we
use the following goodness-of-fit test statistic

$$\sum_{r=1}^{S^*} l\{\tilde m_{\hat\gamma}(x_{r,p})\}, \tag{16}$$

which is built on a set of $S^* < S$ points $x_{r,p}$, selected equally spaced in the support of
the regressors. The statistic in (16) is compared with the percentiles of a $\chi^2$ distribution
with $S^*$ degrees of freedom.
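
The decision rule can be sketched as follows. To keep the example free of external dependencies, the chi-square percentile is obtained with the Wilson-Hilferty approximation; this approximation and the function names are choices made for this sketch, and an exact quantile routine would normally be preferred.

```python
from statistics import NormalDist

def chi2_quantile_wh(q, df):
    """Wilson-Hilferty approximation to the q-quantile of a chi-square with df degrees of freedom."""
    z = NormalDist().inv_cdf(q)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c**0.5) ** 3

def reject_linearity(l_values, alpha=0.05):
    """Sum the pointwise statistics as in (16) and compare with the chi-square(S*) percentile."""
    stat = sum(l_values)
    crit = chi2_quantile_wh(1.0 - alpha, len(l_values))
    return stat, crit, stat > crit
```

Here `l_values` stands for the $S^*$ pointwise log-likelihood ratios evaluated at the selected points.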


4 Empirical results and conclusions 


In this section we discuss some results obtained by estimating the models A, B and C 
presented in equations (4), (5) and (6) and we apply the nonparametric step to test the 
linearity of such models. Note that here we consider specifically the linear functions 
under the null, but the hypotheses stated in (8) might refer to other functional forms 
for $m_\gamma(x)$.

The market log-return is given by the S&P500 index, while the asset log-returns 
are the S = 498 assets included in the S&P stock index. The time series are observed 
from the 3rd of January 2000 to the 31st of December 2007, for a total of 1509 time 
observations. We consider three different rolling window lengths, that is w = 22, 
66 and 264, which correspond roughly to one, three and twelve months of trading. 
The total number of periods in the cross-section analysis (second stage of the Fama 
and MacBeth method) is 1487 when w = 22, 1443 when w = 66 and 1245 when 
w = 264. 

For each asset, we estimate the coefficients $\hat\beta_{j,p}$, $j = 1, \ldots, S$, $p = 1, \ldots, T - w + 1$,
from equation (3). We obtain a matrix of estimated betas, of dimension $(1510 - w, 498)$.
For each resulting period we estimate the cross-section models A, B and C
and we apply the nonparametric testing scheme. The assumptions A.1 in [8] are 
clearly satisfied for the data at hand. The bandwidth used in the kernel smoothing in 
(9), (11) and (12) has been selected automatically for each period p, by considering 
optimality criteria based on a generalised cross-validation algorithm. In (16) we have 
considered S* = 30 equally spaced points. It is well known that kernel estimations 
generally suffer from some form of instability in the tails of the estimated function, 
due to the local sparseness of the observations. To avoid such problems, we selected 
the S* points in the internal side of the support of the regressors, ranging on the central 
95% of the total observed support. 
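
One way to build such a grid for a single regressor is sketched below. Reading "central 95% of the total observed support" as trimming 2.5% of the observed range at each end is an assumption of this sketch; a quantile-based trimming would be another possible reading.

```python
import numpy as np

def interior_grid(x_obs, n_points=30, coverage=0.95):
    """Equally spaced points on the central `coverage` share of the observed range of x_obs."""
    lo, hi = np.min(x_obs), np.max(x_obs)
    margin = 0.5 * (1.0 - coverage) * (hi - lo)  # range trimmed at each end
    return np.linspace(lo + margin, hi - margin, n_points)
```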

In Table 1 we summarise the results of the two testing procedures (parametric and
nonparametric) described in previous sections.

Table 1. Percentage of testing periods for each specified model and window when cases 1-4
occur. Cases 1-4 are as follows. Case 1: the estimated linear coefficients in the parametric
step are jointly equal to zero at level $\alpha = 5\%$, and we do not reject the $H_0$ hypothesis in
the nonparametric stage at the same level; Case 2: the estimated linear coefficients in the
parametric step are jointly equal to zero at level $\alpha = 5\%$, and we reject the $H_0$ hypothesis
in the nonparametric stage at the same level; Case 3: the estimated linear coefficients in the
parametric step are jointly different from zero at level $\alpha = 5\%$, and we do not reject the $H_0$
hypothesis in the nonparametric stage at the same level; Case 4: the estimated linear coefficients
in the parametric step are jointly different from zero at level $\alpha = 5\%$, and we reject the $H_0$
hypothesis in the nonparametric stage at the same level


Window        Regressors (model)
              Model A (β)    Model B (β, σ)    Model C
Case 1
  w = 22      49.42          45.595            57.78
  w = 66      43.87          44.144            51.195
  w = 264     43.449         43.213            52.099
Case 2
  w = 22      31.141         13.114            23.925
  w = 66      38.41          16.078            28.373
  w = 264     38.41          12.53             28.148
Case 3
  w = 22      10.155         26.564            9.695
  w = 66      8.238          23.909            10.409
  w = 264     9.183          25.141            9.465
Case 4
  w = 22      9.284          14.728            8.6
  w = 66      9.483          15.87             10.023
  w = 264     8.959          19.116            10.288
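
The four cases of Table 1 can be expressed as a small decision rule on the two p-values obtained in each testing period. The sketch below is only illustrative: the function names and the use of p-values as inputs are assumptions, not part of the original procedure.

```python
from collections import Counter

def classify_period(p_param, p_nonparam, alpha=0.05):
    """Map the two p-values of one testing period to cases 1-4 of Table 1."""
    coeffs_nonzero = p_param < alpha         # parametric step rejects "all coefficients are zero"
    linearity_rejected = p_nonparam < alpha  # nonparametric step rejects H0 (linearity)
    if not coeffs_nonzero:
        return 2 if linearity_rejected else 1
    return 4 if linearity_rejected else 3

def case_percentages(p_values, alpha=0.05):
    """Percentage of periods falling in each case, from (p_param, p_nonparam) pairs."""
    counts = Counter(classify_period(a, b, alpha) for a, b in p_values)
    return {case: 100.0 * counts.get(case, 0) / len(p_values) for case in (1, 2, 3, 4)}
```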


As Case 1 we label the percentage of testing periods where the estimated linear
coefficients in the parametric step are jointly equal to zero at the testing level $\alpha = 5\%$,
and we do not reject the $H_0$ hypothesis (i.e., the linear relation statistically holds) in
the nonparametric stage at the same level. This case is of particular interest because
the percentage of testing periods when it occurs is not smaller than 43% for all rolling
windows and all sets of regressors. If we combine the results of the two stages (both
the parametric and the nonparametric), this means that in almost half of the testing
periods we validate the linearity of the regression function, but this function is
probably a constant.

As Case 2 we label the percentage of testing periods where the estimated linear
coefficients in the parametric step are jointly equal to zero at level $\alpha = 5\%$, and we
reject the $H_0$ hypothesis in the nonparametric stage at the same level. That is, no
linear relationship can be detected but there is some evidence of nonlinear structures.

As Case 3 we report the percentage of testing periods where the estimated linear
coefficients in the parametric step are jointly different from zero at level $\alpha = 5\%$, and
we do not reject the linearity hypothesis ($H_0$) in the nonparametric stage at the same
level. This case is in favour of the CAPM theory: when this situation occurs we are
validating the idea behind the CAPM, that is, historical information on prices can be
useful to explain the cross-section variations of assets' returns. For this case the best
result occurs for model B, which is the one that uses the pair of regressors $(\beta, \sigma)$.
The statistical conclusion we draw from Case 3 is that when $w = 22$, for 26.6% of the
testing periods we validate the linear relation between assets' returns, the $\beta$s and the
nonsystematic risk ($\sigma$). Moreover
this result does not depend on the rolling window (even though w = 66 produces
slightly better results).

Finally, as Case 4 we label the percentage of testing periods where the estimated
linear coefficients in the parametric step are jointly different from zero at level
$\alpha = 5\%$, and we reject the $H_0$ hypothesis in the nonparametric stage at the same
level. That is, a relationship is present but the nonparametric testing step supports the
evidence for nonlinear effects. It is worthwhile to observe that for all the cases
considered, conclusions do not seem to be related to the particular choice of the rolling
window. They remain basically stable when moving across different choices. Despite
the limitations of the present analysis, the open question remains whether the betas
are a determinant of the cross-section variations of assets' returns at all. From this
study we cannot conclude that model B is validated. But certainly, this occurs in
approximately one quarter of the testing periods. This encourages us to investigate
other possibilities that could reveal stronger patterns in the data. There are several issues
that would be worth investigating further: dependence structures, group structures in
the risk behaviour of assets, robustness issues and nonparametric specifications of the
first stage.

Lee-Carter error matrix simulation:
heteroschedasticity impact on actuarial valuations


Valeria D'Amato and Maria Russolillo


Abstract. Recently a number of approaches have been developed for forecasting mortality. In
this paper, we consider the Lee-Carter model and we investigate in particular the hypothesis
about the error structure implicitly assumed in the model specification, i.e., that the errors are
homoschedastic. The homoschedasticity assumption is quite unrealistic, because the observed
pattern of the mortality rates shows a different variability at old ages than at younger ages.
Therefore, there is a need to analyse the robustness of the estimated parameters. To
this aim, we propose an experimental strategy in order to assess the robustness of the Lee-Carter
model by inducing the errors to satisfy the homoschedasticity hypothesis. Moreover, we apply
it to a matrix of Italian mortality rates. Finally, we highlight the results through an application
to a pension annuity portfolio.


Key words: Lee-Carter model, mortality forecasting, SVD 


1 Introduction 


The background of the research is based on the bilinear mortality forecasting methods. 
These methods are taken into account to describe the improvements in the mortality 
trend and to project survival tables. We focus on the Lee-Carter (hereinafter LC) 
method for modelling and forecasting mortality, described in Section 2. In particular, 
we focus on a sensitivity issue of this model and, in order to deal with it, in Section 3,
we illustrate the implementation of an experimental strategy to assess the robustness 
of the LC model. In Section 4, we run the experiment and apply it to a matrix of Italian 
mortality rates. The results are applied to a pension annuity portfolio in Section 5. 
Finally, Section 6 concludes. 


2 The Lee-Carter model: a sensitivity issue 


The LC method is a powerful approach to mortality projections. The traditional LC 
model analytical expression [7] is the following: 


$$\ln(M_{x,t}) = \alpha_x + \beta_x \kappa_t + E_{x,t}, \tag{1}$$


M. Corazza et al. (eds.), Mathematical and Statistical Methods for Actuarial Sciences and Finance

describing the log of a time series of age-specific death rates $M_{x,t}$ as the sum of
an age-specific parameter independent of time, $\alpha_x$, and a component given by the
product of a time-varying parameter $\kappa_t$, reflecting the general level of mortality, and
the parameter $\beta_x$, representing how rapidly or slowly mortality at each age varies
when the general level of mortality changes. The final term $E_{x,t}$ is the error term,
assumed to be homoschedastic (with mean 0 and variance $\sigma^2$).

On the basis of equation (1), if $\tilde M_{x,t}$ is the matrix holding the mean centred
log-mortality rates, the LC model can be expressed as:

$$\tilde M_{x,t} = \ln(M_{x,t}) - \alpha_x = \beta_x \kappa_t + E_{x,t}. \tag{2}$$


Following LC [7], the parameters $\beta_x$ and $\kappa_t$ can be estimated according to the
Singular Value Decomposition (SVD) with suitable normalisation constraints. The LC model
incorporates different sources of uncertainty, as discussed in LC [8], Appendix B:
uncertainty in the demographic model and uncertainty in forecasting. The former can
be incorporated by considering errors in fitting the original matrix of mortality rates,
while forecast uncertainty arises from the errors in the forecast of the mortality index.
In our contribution, we deal with the demographic component in order to consider
the sensitivity of the estimated mortality index. In particular, the research consists
in defining an experimental strategy to force the fulfilment of the homoschedasticity
hypothesis and evaluate its impact on the estimated $\kappa_t$.
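
A minimal sketch of the SVD estimation step is given below. It assumes the usual Lee-Carter normalisation ($\sum_x \beta_x = 1$ and $\sum_t \kappa_t = 0$) and keeps only the first singular component; the function name is hypothetical and the second-stage adjustment of $\kappa_t$ sometimes applied in the LC literature is not included.

```python
import numpy as np

def fit_lee_carter(log_m):
    """log_m: (ages, years) matrix of log death rates. Returns alpha_x, beta_x, kappa_t."""
    alpha = log_m.mean(axis=1)                 # age-specific average log rate
    centred = log_m - alpha[:, None]           # mean centred log-mortality matrix
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    scale = U[:, 0].sum()
    beta = U[:, 0] / scale                     # normalised so that sum(beta) = 1
    kappa = s[0] * Vt[0, :] * scale            # sums to zero because each centred row does
    return alpha, beta, kappa

def lee_carter_residuals(log_m, alpha, beta, kappa):
    """Error matrix: centred log rates minus the rank-one fit beta_x * kappa_t."""
    return log_m - alpha[:, None] - np.outer(beta, kappa)
```

On data generated exactly from the model, the fit recovers the parameters and the residual matrix is zero up to rounding.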


3 The experiment 


The experimental strategy introduced above, with the aim of inducing the errors to 
satisfy the homoschedasticity hypothesis, consists in the following phases [11]. The 
error term can be expressed as follows: 


$$E_{x,t} = \tilde M_{x,t} - \beta_x \kappa_t, \tag{3}$$

i.e., as the difference between the matrix $\tilde M_{x,t}$ referring to the mean centred
log-mortality rates and the product between $\beta_x$ and $\kappa_t$ deriving from the estimation of the
LC model. The next step consists in exploring the residuals by means of statistical
indicators such as: range, interquartile range, mean absolute deviation (MAD) of
a sample of data, standard deviation, box-plot, etc. Afterwards, we find
those age groups that show higher variability in the errors. Once we have explored
the residuals $E_{x,t}$, we may find some non-conforming age groups. We rank them
according to decreasing non-conformity, i.e., from the more widespread to the more
homogeneous one. For each selected age group, it is possible to reduce the variability
by dividing the entire range into several quantiles, leaving aside each time the fixed
$\alpha\%$ of the extreme values. We replicate each run under the same conditions a
large number of times (i.e., 1000). For each age group and for each percentile, we
define a new error matrix. The successive runs give more and more homogeneous
error terms. By way of this experiment, we investigate the residuals' heteroschedasticity
deriving from two factors: the age group effect and the number of altered values
in each age group. In particular, we wish to determine the hypothetical pattern of
$\kappa_t$ by increasing the homogeneity in the residuals. Thus, under these assumptions,
we analyse the changes in $\kappa_t$ that can be derived from every simulated error matrix.
In particular, at each run we obtain a different error matrix $\bar E_{x,t}$, which is used
for computing a new data matrix $\hat M_{x,t}$, from which it is possible to derive the
corresponding $\kappa_t$. To clarify the procedure analytically, let us introduce the following
relation:


$$\left[\tilde M_{x,t} - \bar E_{x,t}\right] = \hat M_{x,t} \Rightarrow \beta_x \kappa_t, \tag{4}$$

where $\hat M_{x,t}$ is a new matrix of data obtained as the difference between $\tilde M_{x,t}$ (the
matrix holding the raw mean centred log-mortality rates) and $\bar E_{x,t}$ (the matrix holding
the mean of altered errors). From $\hat M_{x,t}$, if $\beta_x$ is fixed, we obtain the $\kappa_t$ as the ordinary
least squares (OLS) coefficients of a regression model. We replicate the procedure by
considering further non-homogeneous age groups, obtaining at each
step a new $\kappa_t$. We carry on the analysis with a graphical exploration
of the different $\kappa_t$ patterns. Thus, we plot the experimental results so that all the $\kappa_t$'s
are compared with the ordinary one. Moreover, we compare the slope effect of the
experimental $\kappa_t$ through a numerical analysis.
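
With $\beta_x$ held fixed, and assuming that the regression has no intercept (an assumption of this sketch, since the text does not specify it), the OLS coefficient for year $t$ has the closed form below, where $\hat M_{x,t}$ denotes the adjusted data matrix of relation (4).

```latex
\hat\kappa_t
  = \arg\min_{\kappa} \sum_x \left( \hat M_{x,t} - \beta_x \kappa \right)^2
  = \frac{\sum_x \beta_x \hat M_{x,t}}{\sum_x \beta_x^2}
```

Each year is therefore fitted separately, by projecting the corresponding column of the adjusted matrix onto the fixed age profile $\beta_x$.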


4 Running the experiment 


The experiment is applied to a data matrix holding the Italian mean centred log- 
mortality rates for the male population from 1950 to 2000 [6]. In particular, the rows 
of the matrix represent the 21 age groups [0], [1-4], [5-9], ..., [95-99] and the 
columns refer to the years 1950-2000. Our procedure consists of an analysis of the 
residuals’ variability through some dispersion indices which help us to determine the 
age groups in which the model hypothesis does not hold (see Table 1). 
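
The dispersion indices reported in Table 1 can be computed per age group along the following lines. This is a sketch: MAD is taken as the mean absolute deviation from the mean, as named in Section 3, and the sample standard deviation (divisor $n - 1$) is an assumption.

```python
import numpy as np

def dispersion_indices(residuals):
    """residuals: (ages, years) error matrix. Returns per-age-group dispersion indices."""
    q75, q25 = np.percentile(residuals, [75, 25], axis=1)
    centred = residuals - residuals.mean(axis=1, keepdims=True)
    return {
        "iq_range": q75 - q25,
        "mad": np.abs(centred).mean(axis=1),   # mean absolute deviation
        "range": residuals.max(axis=1) - residuals.min(axis=1),
        "std": residuals.std(axis=1, ddof=1),  # sample standard deviation (assumed)
    }
```

Ranking the age groups by any of these indices, from largest to smallest, gives a candidate order of entry in the experiment.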

We can notice that the residuals in the age groups 1-4, 5-9, 15-19 and 25-29 
(written in bold character) are far from being homogeneous. Thus the age groups 
1-4, 15-19, 5-9, 25-29 will be sequentially, and according to this order, entered in 
the experiment. Alongside the dispersion indices, we provide a graphical analysis by 
displaying the boxplot for each age group (Fig. 1), where on the x-axis the age groups 
are reported and on the y-axis the residuals’ variability. If we look at the age groups 
1-4 and 15-19 we can notice that they show the widest spread compared to the others. 
In particular, we observe that for those age groups the range goes from −2 to 2.

For this reason, we explore to what extent the estimated $\kappa_t$ are affected by such
variability. A way of approaching this issue can be found by means of the following
replicating procedure, implemented in a Matlab routine. For each of the four age
groups we substitute the extreme residual values with the following six quantiles:
5%, 10%, 15%, 20%, 25%, 30%. Then we generate 1000 random replications (for
each age group and each interval). From the replicated errors (1000 times × 4 age
groups × 6 percentiles) we compute the estimated $\kappa_t$ (6 × 4 × 1000 times) and then
we work out the 24 averages of the 1000 simulated $\kappa_t$. In Figure 2 we show the 24,000
estimated $\kappa_t$ through a Plot-Matrix, representing the successive age groups entered in
the experiment in the four rows and the successive increment in the percentage of outer
values which have been transformed in the 6 columns. We can notice the different $\kappa_t$
behaviour in the four rows as more age groups and percentiles are considered.

Table 1. Different dispersion indices to analyse the residuals' variability

Age       IQ range   MAD     Range   STD
0         0.107      0.059   0.300   0.075
1-4       2.046      0.990   4.039   1.139
5-9       1.200      0.565   2.318   0.653
10-14     0.165      0.083   0.377   0.099
15-19     1.913      0.872   3.615   1.007
20-24     0.252      0.131   0.510   0.153
25-29     0.856      0.433   1.587   0.498
30-34     0.536      0.250   1.151   0.299
35-39     0.240      0.186   0.868   0.239
40-44     0.787      0.373   1.522   0.424
45-49     0.254      0.126   0.436   0.145
50-54     0.597      0.311   1.290   0.367
55-59     0.196      0.151   0.652   0.187
60-64     0.247      0.170   0.803   0.212
65-69     0.207      0.119   0.604   0.147
70-74     0.294      0.171   0.739   0.202
75-79     0.230      0.117   0.485   0.133
80-84     0.346      0.187   0.835   0.227
85-89     0.178      0.099   0.482   0.124
90-94     0.307      0.153   0.701   0.186
95-99     0.071      0.042   0.220   0.051

Fig. 1. Box-plot of the residuals' variability for each age group, starting from 0 up to 95-99

Fig. 2. $\kappa_t$ resulting from different experimental conditions: the age group (on the rows) and the
different percentiles (on the columns) effect

For better interpretation of these results, we have plotted a synthetic view of the
resulting average of the 1000 $\kappa_t$ under the 24 conditions (see Fig. 3) and compared
them with the series derived by the traditional LC estimation.


Fig. 3. A comparison between the 24 averaged $\kappa_t$ (in red) and the original one (in black)

In Figure 3, where the x-axis shows the years from 1950 to 2000 and the
y-axis shows the $\kappa_t$ values, we represent the 24,000 $\kappa_t$ grouped according to the
24 different experimental conditions. We can observe the impact on the $\kappa_t$ series of
the change of age groups and of the increase in the percentage of random values considered
in the selected age groups. We can notice that the $\kappa_t$ derived by the experiment (in
red) tends to be flatter than the original one (drawn in black), i.e., there are changes
in homogeneity on the $\kappa_t$ for each of the four age groups. By comparing the ordinary
$\kappa_t$ to the simulated ones, we obtain information about the effect of the lack of
homoschedasticity on the LC estimates. To what extent does it influence the sensitivity
of the results? We note that the more homogeneous the residuals are, the flatter the $\kappa_t$
is. From an actuarial point of view, the $\kappa_t$ series reveals an important result: when we
use the new $\kappa_t$ series to generate life tables, we find survival probabilities lower than
the original ones. The effect of that on a pension annuity portfolio will be illustrated
in the following application.


5 Numerical illustrations 


In this section, we provide an application of the previous procedure for generating 
survival probabilities by applying them to a pension annuity portfolio in which bene- 
ficiaries enter the retirement state at the same time. In particular, having assessed the 
breaking of the homoschedasticity hypothesis in the Lee-Carter model, we intend to 
quantify its impact on given quantities of interest of the portfolio under consideration. 
The analysis concerns the dynamic behaviour of the financial fund checked year by 
year arising from the two flows in and out of the portfolio, the first consisting in the 
increasing effect due to the interest maturing on the accumulated fund and the second 
in the outflow represented by the benefit payments due in case the pensioners are still 
alive. Let us fix one of the future valuation dates, say corresponding to time $k$, and
consider what the portfolio fund is at this valuation date. As concerns the portfolio
fund consistency at time $k$, we can write [2]:

$$Z_k = Z_{k-1}(1 + i_k^*) + N_k P \quad \text{with } k = 1, 2, \ldots, n - 1, \tag{5}$$
$$Z_k = Z_{k-1}(1 + i_k^*) - N_k R \quad \text{with } k = n, n + 1, \ldots, \omega - x, \tag{6}$$

where $N_0$ represents the number of persons of the same age $x$ at contract issue $t = 0$
reaching the retirement state at the same time $n$, that is at the age $x + n$, $N_k$ is the
number of those still alive at time $k$, and $i_k^*$ is a
random financial interest rate in the time period $(k - 1, k)$. The formulas respectively
refer to the accumulation phase and the annuitisation phase.
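
The two recursions can be sketched in code as follows. The starting value $Z_0 = 0$, the function name and the argument layout are assumptions of this example; `n_alive[k-1]` is read as the number of beneficiaries alive at time $k$.

```python
def portfolio_fund(n_alive, rates, premium, benefit, n):
    """Fund path Z_1, ..., Z_K from recursions (5)-(6), starting from Z_0 = 0 (assumed).

    n_alive[k-1] and rates[k-1] refer to period k; premiums are collected while k < n,
    benefits are paid from k = n onwards.
    """
    z, path = 0.0, []
    for k, (alive, i_k) in enumerate(zip(n_alive, rates), start=1):
        cash_flow = alive * premium if k < n else -alive * benefit
        z = z * (1.0 + i_k) + cash_flow
        path.append(z)
    return path
```

Passing a constant list of rates reproduces the deterministic scenario, while a simulated path of rates gives one realisation of the stochastic scenario.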


5.1 Financial hypotheses 


Referring to the financial scenario, we refer to the interest rate as the rate of return
on investments linked to the assets in which the insurer invests. In order to compare, we
consider both a deterministic interest rate and a stochastic interest rate framework.
As regards the former, we assume that the deposited portfolio funds earn interest at a
rate fixed at a level of 3%. As regards the latter, we adopt the Vasicek
model [12]. This stochastic interest rate environment seems to be particularly
suitable for describing the instantaneous global rate of return on the assets linked to the
portfolio under consideration, because of potential negative values. As is well known,
this circumstance is not in contrast with the idea of taking into account a short rate
reflecting the global investment strategy related to the portfolio [9].
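
For reference, a Vasicek short rate follows $dr_t = a(b - r_t)\,dt + \sigma\,dW_t$, and a path can be simulated with the exact Gaussian transition of the process. The sketch below is illustrative only: the parameter names and any values passed to it are hypothetical and are not those used in the application.

```python
import numpy as np

def simulate_vasicek(r0, a, b, sigma, n_steps, dt=1.0, rng=None):
    """Simulate one Vasicek path r_0, ..., r_{n_steps} using the exact transition law."""
    rng = np.random.default_rng() if rng is None else rng
    decay = np.exp(-a * dt)
    step_sd = sigma * np.sqrt((1.0 - decay**2) / (2.0 * a))  # conditional std over one step
    rates = np.empty(n_steps + 1)
    rates[0] = r0
    for k in range(n_steps):
        # Mean reversion towards b plus a Gaussian shock, which can make the rate negative.
        rates[k + 1] = b + (rates[k] - b) * decay + step_sd * rng.standard_normal()
    return rates
```

Because the shocks are Gaussian, simulated rates can fall below zero, which is the feature referred to in the text above.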


5.2 Mortality hypotheses 


As concerns the mortality model, we consider the survival probabilities generated
by the above-described simulation procedure (hereinafter simulation method) and by
the classical estimation of the Lee-Carter model (traditional method). In the former
methodology we consider the $\kappa_t$ series arising from the experiment. Following the
Box-Jenkins procedure, we find that an ARIMA(0,1,0) model is the most suitable for
our time series. After obtaining the projected $\kappa_t$ series, we construct the projected life
table and then we extract the probabilities referred to insured lives aged $x = 45$. In
Figure 4 we report the survival probability distribution as a function of different LC
estimation methods: the traditional and the simulation methods. We can notice that
the pattern of simulated probabilities lies under the traditional probabilities. Moreover
this difference increases as the projection time increases.
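
An ARIMA(0,1,0) model is a random walk, so its point forecasts extend the last observed value. The sketch below includes an optional drift term, estimated as the mean first difference; the drift is an assumption of this example (it is customary in Lee-Carter applications but is not stated in the text), and the function name is hypothetical.

```python
import numpy as np

def forecast_random_walk(kappa, horizon, with_drift=True):
    """Point forecasts of kappa_{T+1}, ..., kappa_{T+horizon} under ARIMA(0,1,0)."""
    kappa = np.asarray(kappa, dtype=float)
    drift = np.diff(kappa).mean() if with_drift else 0.0
    return kappa[-1] + drift * np.arange(1, horizon + 1)
```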

Fig. 4. Comparison between the two different methods for generating survival probabilities on
the basis of the Lee-Carter model: traditional and simulation method

Thus, referring to the financial and the demographic stochastic environments
described above, we evaluate the periodic portfolio funds. As regards the premium
calculation hypotheses, we use two different assumptions (simulated LC, classical
LC) and the fixed interest rate at 4%. We use the same mortality assumptions made in
the premium calculation for the portfolio fund dynamics from the retirement age
on, i.e., we resort to a sort of homogeneity in the demographic
description, in the light of the main results of [2]. In the following graphs (see Figs. 5
and 6) we represent the portfolio funds along the potential whole contract life, i.e.,
both in the accumulation phase and in the annuitisation phase. The portfolio funds
trend is calculated on a pension annuity portfolio referred to a cohort of $c = 1000$
beneficiaries aged $x = 45$ at time $t = 0$ and entering the retirement state 20 years
later, that is at age 65. The cash flows are represented by the constant premiums $P$,
payable at the beginning of each year up to $t = 20$ in case the beneficiary is still alive
at that moment (accumulation phase), and by the constant benefits $R = 100$ payable at
the beginning of each year after $t = 20$ (annuitisation phase) in case the beneficiary
is still alive at that moment. Figure 5 shows how the portfolio funds increase with
better survival probabilities. In particular, this figure represents the portfolio
funds earning interest, term by term, at the fixed rate of return of 3%, from the time
of issue on. As a first result we find that the portfolio fund amount is overestimated
when the survival probabilities are calculated on the basis of the projection of the
traditional LC estimation. On the basis of the results reported above, we can notice
how the lack of homoschedasticity affects the portfolio risk assessment.

Fig. 5. Portfolio of 1000 pension annuities, $x = 45$, $t = 20$, $r = 100$. Fixed rate at 3%

Fig. 6. Portfolio of 1000 pension annuities, $x = 45$, $t = 20$, $r = 100$. Stochastic rate of return

Finally, we evaluate the portfolio fund consistency from the contract issue on, 
adopting the Vasicek model for describing the instantaneous global rate of return 
on the assets linked to the portfolio under consideration. As in the previous case, 
Figure 6 shows that the traditional forecasting method blows up the portfolio funds 
amount both into the accumulation and into the annuitisation phases. Our findings are 
confirmed also in the case of the stochastic rate of return. For this reason, we provide 
evidence that the lack of homoschedasticity has a strong effect on the actuarial results. 


6 Conclusions 


The simulation procedure proposed in this paper is characterised by an experimental
strategy to stress the fulfilment of the homoschedasticity hypothesis of the LC model.
In particular, we simulate different experimental conditions to force the errors to
satisfy the model hypothesis in a fitting manner. Besides, we develop the $\kappa_t$ series for
generating more realistic survival probabilities. Finally we measure the impact of the
two different procedures for generating survival probabilities, using the traditional
and simulation methods, on a portfolio of pension annuities. The applications, referred
to the male population, show that the probabilities generated on the basis of the
simulation procedure are lower than the probabilities obtained through the traditional
methodology by the LC model. In particular, if we apply the simulated projections
to a financial valuation of the periodic portfolio funds of a pension annuity portfolio, we
can observe lower corresponding values than the traditional ones, in both the
so-called accumulation and annuitisation phases. In particular, we can notice more sizeable
portfolio funds under the traditional methodology. In other words, the insurer's
financial position would be overestimated by means of the traditional method in
comparison with the simulation method. The results of the appraisal arise from the
different behaviours of the residuals. In fact, in the traditional methodology, we get
heteroschedasticity in the residuals for some age groups which can lead to more
optimistic survival projections. On the other hand, on the basis of the simulation
procedure, the final result shows how a more regular residual matrix leads to a flatter
$\kappa_t$ series according to the LC model hypothesis. This circumstance determines more
pessimistic survival projections.

Estimating the volatility term structure


Antonio Diaz, Francisco Jareño, and Eliseo Navarro


Abstract. In this paper, we proceed to estimate term structure of interest rate volatilities, 
finding that these estimates depend significantly on the model used to estimate the term structure 
(Nelson and Siegel or Vasicek and Fong) and the heteroscedasticity structure of errors (OLS 
or GLS weighted by duration). We conclude in our empirical analysis that there are significant 
differences between these volatilities in the short (less than one year) and long term (more than 
ten years). Finally, we can detect that three principal components explain 90% of the changes 
in volatility term structure. These components are related to level, slope and curvature. 


Key words: volatility term structure (VTS), term structure of interest rates (TSIR), GARCH, 
principal components (PCs) 


1 Introduction 


We define the term structure of volatilities as the relationship between the volatil- 
ity of interest rates and their maturities. The importance of this concept has been 
growing over recent decades, particularly as interest rate derivatives have developed 
and interest rate volatility has become the key factor for the valuation of assets such 
as caplets, caps, floors, swaptions, etc. Moreover, interest rate volatility is one of 
the inputs needed to implement some term structure models such as those of Black, 
Derman and Toy [4] or Hull and White [12], which are particularly popular among 
practitioners. 

However, one of the main problems concerning the estimation of the volatility 
term structure (VTS) arises from the fact that zero coupon rates are unobservable. 
So they must be previously estimated and this requires the adoption of a particular 
methodology. The problem of the term structure estimation is an old question widely 
analysed in the literature and several procedures have been suggested over the last 
thirty years. 

Among the most popular methods are those developed by Nelson and Siegel [14]
and Vasicek and Fong [17]. In Spain, these methods have been applied in Núñez [15]
and Contreras et al. [7] respectively.

A large body of literature focuses on the bond valuation ability of these alternative
models without analysing the impact of the term structure estimation method on
second or higher moments of the zero coupon rates. In this paper, instead, we
focus on the second moment of interest rates derived from alternative term structure
methods. So, the aim of this paper is to analyse whether there are significant differences
between the estimates of the VTS depending on the model used for estimating the
term structure of interest rates (TSIR).

In this study we compare Nelson and Siegel [14], $NS^O$, Vasicek and Fong [17],
$VF^O$, and both models using two alternative hypotheses about the error variance. First
we assume homoscedasticity in the bond price errors, so the term structure
is estimated by OLS. Alternatively, a heteroscedastic error structure is employed,
estimating by GLS and weighting pricing errors by the inverse of the bond's duration,
$NS^G$ and $VF^G$.

In the literature, it is usual to minimise errors in prices in order to fit any
model for estimating the TSIR. Nevertheless, this procedure tends to misestimate
short-term interest rates. This is because an error in short-term bond prices induces
an error in the estimation of short-term interest rates greater than the error in long-term
interest rates produced by the same error in long-term bond prices. In order to solve
this problem, it is usual to weight pricing errors by the reciprocal of the bond's Macaulay
duration.¹

Once estimates of TSIR are obtained, we proceed to estimate interest rate volatil- 
ities using conditional volatility models (GARCH models). 

In addition, we try to identify the three main components in the representation of 
the VTS for each model. Some researchers have studied this subject, finding that a 
small number of factors are able to represent the behaviour of the TSIR [3, 13, 15]. 
Nevertheless, this analysis has not been applied, to a large extent, to the VTS (except, 
e.g., [1]). 

We apply our methodology to the VTS from estimates of the Spanish TSIR. The 
data used in this empirical analysis are the Spanish Treasury bill and bond prices of 
actual transactions from January 1994 to December 2006. 

We show statistically significant differences between estimates of the term structure
of interest rate volatilities depending on the model used to estimate the term
structure and the heteroscedasticity structure of errors (NS^O, NS^G, VF^O and VF^G),
mainly in the short-term (less than one year) and in the long-term (more than ten years)
volatility. This finding could have significant consequences for many issues related
to risk management in fixed income markets. On the other hand, we find three
principal components (PCs) that can be interpreted as level, slope and curvature and
that are not significantly different among our eight proposed models.

The rest of our paper is organised as follows. The next section describes the
data used in this paper and the methodologies employed to estimate the TSIR: the
Nelson and Siegel [14], NS, and Vasicek and Fong [17], VF, models. The third section
describes the model used to estimate the term structure of volatilities. The fourth
section analyses the differences in the VTS from our eight different models. The
last two sections contain a principal component analysis of the VTS and a
summary with conclusions.


1 This correction is usual in official estimations of the central banks [2].


2 Data


The database we use in this research contains daily volume-weighted averages of all
the spot transaction prices and yields of all Spanish Treasury bills and bonds traded
and registered in the dealer market or Bank of Spain’s book entry system. They are
obtained from annual files available at the “Banco de España” website.² We focus on
27 different maturities between 1 day and 15 years. Our sample runs from January
1994 to December 2006.

First of all, in order to refine our data, we have eliminated from the sample those 
assets with a trading volume less than 3 million euros (500 million pesetas) in a single 
day and bonds with term to maturity less than 15 days or larger than 15 years. Besides, 
in order to obtain a good adjustment in the short end of the yield curve, we always 
include in the sample the one-week interest rate from the repo market. 

From the price (which must coincide with the quotient between effective volume
and nominal volume of the transaction) provided by the market, we obtain the yield to
maturity on the settlement day. Sometimes this yield diverges from the yield reported
by the market. Controlling for these conventions, we recalculate the yield using compound
interest and the year basis ACT/ACT for both markets.³

We estimate the zero coupon bond yield curve using two alternative methods. The
first one fits Nelson and Siegel’s [14] exponential model for the estimation
of the yield curve. The second methodology is developed in Contreras et al. [7],
where the Vasicek and Fong [17] term structure estimation method (VF^O) is adapted
to the Spanish Treasury market. VF^O uses a non-parametric methodology based on
exponential splines to estimate the discount function. A unique variable knot, which is
located to minimise the sum of squared residuals, is used to adjust exponential splines.

With respect to the estimation methodology, we apply both OLS and GLS. In the
second case we weight the bond price errors by the inverse of the bond's Macaulay
duration in order to avoid larger interest rate errors at the short end of
the term structure.
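As a rough illustration of the two weighting schemes, the sketch below computes a Macaulay
duration and the sum of squared price errors with and without duration weights. The function
names, the annual-compounding convention and the sample bonds are assumptions made for this
example only; they are not taken from the estimation code behind the paper.

```python
def macaulay_duration(times, cashflows, y):
    """Macaulay duration in years, assuming annual compounding at yield y."""
    pv = [cf / (1.0 + y) ** t for t, cf in zip(times, cashflows)]
    return sum(t * v for t, v in zip(times, pv)) / sum(pv)


def sum_squared_price_errors(market, fitted, durations=None):
    """Unweighted (OLS-style) objective when durations is None; otherwise
    each price error is divided by the bond's duration before squaring."""
    if durations is None:
        durations = [1.0] * len(market)
    return sum(((p - q) / d) ** 2 for p, q, d in zip(market, fitted, durations))


# Hypothetical data: a 6-month bill and a 10-year zero, both mispriced by 1.
market, fitted = [100.0, 100.0], [99.0, 99.0]
durations = [macaulay_duration([0.5], [100.0], 0.04),
             macaulay_duration([10.0], [100.0], 0.04)]
ols_objective = sum_squared_price_errors(market, fitted)
gls_objective = sum_squared_price_errors(market, fitted, durations)
```

With equal price errors, the weighted objective is dominated by the short bill, which is the
intended effect: the fit is pulled towards the short end of the curve.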

In Figure 1 we illustrate the resulting estimations of the term structure on a single
day depending on the weighting scheme applied to the error terms. It can be seen how
assuming OLS or GLS affects mainly the estimates in the short and long ends of the
TSIR even though in both cases we use the Nelson and Siegel model.⁵


2 http://www.bde.es/banota/series.htm. Information reported is only about traded issues. It
contains the following daily information for each reference: number of transactions, settle- 
ment day, nominal and effective trading volumes, maximum, minimum and average prices 
and yields. 

3 These divergences are due to simple or compound interest and a 360-day or 365-day year 
basis depending on the security term to maturity. http://www.bde.es/banota/actuesp.pdf 

4 See, for example, Diaz and Skinner [10], Diaz et al. [8] and Diaz and Navarro [9] for a more 
detailed explanation. Also, a number of authors have proposed extensions to the NS model 
that enhance flexibility [16]. 

5 When using the Vasicek and Fong model, these differences are mainly shown in the short 
term. We observe differences depending on the model employed (VF or NS) even when the 
same error weighting scheme is used. 
Fig. 1. TSIR estimated by NS^O and NS^G (01.07.1994)


In summary, we use four different estimation models: Nelson and Siegel [14],
NS^G, and Vasicek and Fong [17], VF^G, which take into account residuals weighted
by the reciprocal of maturity, and NS^O and VF^O, that is, with non-weighted residuals.
These alternative estimation procedures provide the input of the subsequent functional
principal component analysis.


3 GARCH models 


VTS is an essential issue in finance, so it is important to have good volatility forecasts,
which are based on the fact that volatility is time-varying in high-frequency data. In
general, there are several reasons to model and forecast volatility.
First of all, it is necessary to analyse the risk of holding an asset⁶ and the value of
an option, which depends crucially on the volatility of the underlying asset. Finally,
more efficient estimators can be obtained if heteroscedasticity in the errors is handled
properly.

In order to achieve these forecasts, extensive previous literature has used autore- 
gressive conditional heteroscedasticity (ARCH) models, as introduced by Engle [11] 
and extended to generalized ARCH (GARCH) in Bollerslev [5]. These models nor- 
mally improve the volatility estimates, to a large extent, compared with a constant 
variance model and they provide good volatility forecasts, so they are widely used 
in various branches of econometrics, especially in financial time series analysis. In 
fact, it is usually assumed that interest rate volatility can be accurately described by 
GARCH models. 
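To make the conditional variance idea concrete, the following sketch runs the GARCH(1,1)
recursion sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1] over a short series.
The parameter values and the input series are purely illustrative assumptions; in practice
the parameters would be estimated by maximum likelihood rather than fixed by hand.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances of a GARCH(1,1) model with given parameters.

    The recursion is started at the unconditional variance
    omega / (1 - alpha - beta), which requires alpha + beta < 1.
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


# Hypothetical demeaned interest rate changes and illustrative parameters.
changes = [0.0, 0.02, -0.01, 0.0]
path = garch11_variance(changes, omega=1e-5, alpha=0.1, beta=0.8)
```

The large change in the second observation raises the conditional variance one step later,
which is the volatility clustering that a constant variance model cannot capture.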


6 In fact, VaR estimates need as the main input the volatility of portfolio returns. 
Taking into account a great variety of models (GARCH, ARCH-M, TGARCH,
EGARCH ...), we identify the best one for each estimate of the TSIR: Nelson and
Siegel (NS^O), Vasicek and Fong (VF^O) and both models weighted by duration (NS^G
and VF^G), using the Akaike Information Criterion (AIC). We select the ML approach for
estimating the GARCH parameters.⁷ In particular, GARCH models fit very well when
we use NS^O and VF^G. Nevertheless, T-GARCH and E-GARCH seem to be the best
models for VF^O and NS^G estimations, respectively.


4 Differences in the volatility from different models 


In this section we study the differences between the volatility term structure from different
estimation models of the TSIR (NS^O, VF^O, NS^G and VF^G) and conditional
volatility models (GARCH models in each previous case). In the first type of model,
we obtain the historical volatility using 30-, 60- and 90-day moving windows and the
standard deviation measure. We show the results with a 30-day moving window.
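A moving-window historical volatility of this kind can be sketched in a few lines. The
function name and the toy input are assumptions for illustration; the series used in the
paper and any annualisation convention are not specified here.

```python
from statistics import stdev


def rolling_volatility(changes, window=30):
    """Sample standard deviation over each full window of consecutive observations."""
    if window < 2 or window > len(changes):
        raise ValueError("window must be between 2 and len(changes)")
    return [stdev(changes[i - window:i]) for i in range(window, len(changes) + 1)]


# Toy series with a 2-observation window, only to show the mechanics.
vol = rolling_volatility([0.01, -0.01, 0.01, -0.01], window=2)
```

Each entry of `vol` corresponds to the window ending at that observation, so a series of
length n yields n - window + 1 volatility estimates.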

As a whole we can see a repeating pattern in the shape of the VTS: initially 
decreasing, then increasing until one to two years term and finally we can observe a 
constant or slightly decreasing interest rate volatility as we approach the long term 
of the curve. This is consistent with Campbell et al. [6], who argue that the hump of 
the VTS in the middle run can be explained by reduced forecast ability of interest 
rate movements at horizons around one year. They argue that there is some short- 
run forecastability arising from Federal Reserve operating procedures, and also some 
long-run forecastability from business-cycle effects on interest rates. 

At first glance, volatility estimates for the different models used to estimate the 
interest rate term structure reveal how the methodology employed to estimate zero 
coupon bonds may have an important impact, both in level and shape, on the subse- 
quent estimate of the VTS. This can be more clearly seen in Figure 2, where we show 
the VTS for our 8 cases on some particular days: 
Fig. 2. Volatility Term Structure (VTS) among different models


7 The selected model for each maturity and estimation model of the TSIR is available, but we 
do not exhibit these results so as to lighten the article. 
In order to refine our analysis, we proceed to measure the average differences
between volatility estimates obtained with two alternative methods. We can
detect that these differences seem to be higher in the short term (less than one year)
and in the long term (more than ten years). Finally, we use some statistics to test
whether volatility series have the same mean, median and variance (Table 1). In order
to perform this analysis, we obtain an Anova-F test for the mean analysis, Kruskal-Wallis
and van der Waerden tests for the median analysis and, finally, Levene and
Brown-Forsythe tests for analysing the significance of the VTS variance.
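For readers unfamiliar with the first of these tests, the one-way Anova-F statistic can be
computed directly from between-group and within-group sums of squares. The sketch below is
a generic textbook implementation with made-up groups, not the software used for Table 1;
each group would correspond to the volatility series produced by one model at a given maturity.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of samples (one list per group)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical groups: three short series with shifted means.
f_stat = anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]])
```

Large values of the statistic, relative to the F distribution with (k - 1, n - k) degrees of
freedom, count as evidence against equal means.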


Table 1. Tests of equality of means, medians and variances among different models for each 
maturity 


                                                    Maturity (years)

Test  0.25         0.5          0.75         1            3          5          10           12           15
F     477.8088^c   254.7131^c   97.67177^c   27.64625^c   0.653847   0.305357   2.175614^b   5.938809^c   175.7461^c
K-W   4349.893^c   2636.512^c   1151.682^c   433.3387^c   7.505526   3.589879   6.862232     44.18141^c   1141.098^c
vW    4454.727^c   2607.419^c   1100.201^c   379.1995^c   8.914100   4.463184   11.93060     55.15105^c   1170.865^c
L     194.7067^c   80.67102^c   20.38274^c   4.522192^c   0.106095   0.259973   4.682544^c   6.543890     165.7889^c
B-F   145.3684^c   58.94565^c   14.20114^c   2.158965^b   0.092483   0.217367   2.481528^b   3.134396^c   91.89411^c

^a p < 0.10, ^b p < 0.05, ^c p < 0.01
F: Anova-F Test, K-W: Kruskal-Wallis Test, vW: van der Waerden Test, L: Levene Test, B-F: Brown-Forsythe Test


On the one hand, the statistics offer evidence against the null hypothesis of homogeneity
in mean and median for the shorter maturities (below 1 year) and also for the longer
maturities (more than 10 years). On the other hand, the statistics that test
whether the volatility produced by the eight models has the same variance show the
same results as the mean and median analysis, that is, we find evidence against the null
hypothesis for the shorter and longer maturities.

To summarise, this analysis shows that volatility estimates using different models 
and techniques display statistically significant differences, mainly in the shorter and 
longer maturities, as would be expected. 


5 A principal component analysis of volatility term structure (VTS)


In this section, we try to reduce the dimensionality of the vector of 27 time series 
of historical/conditional volatilities, working out their PCs, because this analysis is 
often used to identify the key uncorrelated sources of information. 

This technique decomposes the sample covariance matrix or the correlation matrix
computed for the series in the group. The row labelled “eigenvalue” in Table 2 reports
the eigenvalues of the sample second moment matrix in descending order from left to
right. We also show the variance proportion explained by each PC. Finally, we collect
the cumulative sum of the variance proportion from left to right, that is, the variance
proportion explained by PCs up to that order. The first PC is computed as a linear
combination of the series in the group with weights given by the first eigenvector. The
second PC is the linear combination with weights given by the second eigenvector
and so on.


8 Note that we analyse volatility changes (see, for example, [3]).


Table 2. Main results of the principal component analysis

            Historical Volatility                   Conditional Volatility
            NS^O      NS^G      VF^O      VF^G      GNS^O     GNS^G     GVF^O     GVF^G
First Principal Component
Eigenvalue  14.47963  14.42727  12.50790  14.95705  15.11595  14.52248  13.30651  15.23433
Var. prop.  0.536283  0.534343  0.463255  0.553965  0.559850  0.537870  0.492834  0.564234
Cum. prop.  0.536283  0.534343  0.463255  0.553965  0.559850  0.537870  0.492834  0.564234
Second Principal Component
Eigenvalue  8.191949  7.305261  7.484512  6.623200  7.767501  7.520611  7.930251  6.769219
Var. prop.  0.303406  0.270565  0.277204  0.245304  0.287685  0.278541  0.293713  0.250712
Cum. prop.  0.839688  0.804909  0.740460  0.799268  0.847535  0.816411  0.786547  0.814946
Third Principal Component
Eigenvalue  2.440719  2.549777  2.997161  2.321565  2.149763  2.366136  2.400861  2.120942
Var. prop.  0.090397  0.094436  0.111006  0.085984  0.079621  0.087635  0.088921  0.078553
Cum. prop.  0.930085  0.899345  0.851466  0.885252  0.927156  0.904045  0.875467  0.893499
Fourth Principal Component
Eigenvalue  1.216678  1.318653  2.161253  1.388741  1.234067  1.270866  1.866241  1.237788
Var. prop.  0.045062  0.048839  0.080046  0.051435  0.045706  0.047069  0.069120  0.045844
Cum. prop.  0.975147  0.948184  0.931512  0.936687  0.972862  0.951114  0.944588  0.939343
Fifth Principal Component
Eigenvalue  0.500027  0.711464  0.755749  0.812430  0.473576  0.690566  0.677145  0.788932
Var. prop.  0.018520  0.026351  0.027991  0.030090  0.017540  0.025577  0.025079  0.029220
Cum. prop.  0.993667  0.974534  0.959503  0.966777  0.990402  0.976691  0.969667  0.968563

A G before the name of the model indicates that a GARCH model was used.
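The relationship between the three rows reported for each PC can be checked numerically.
The sketch below takes the five eigenvalues reported in Table 2 for the first model (NS with
OLS, historical volatility) and assumes, as the reported proportions suggest, that the
decomposition was applied to the correlation matrix of the 27 series, so that the
eigenvalues sum to 27. That normalisation is an inference from the numbers, not a statement
made in the text.

```python
from itertools import accumulate

# Eigenvalues reported in Table 2, first column (historical volatility).
eigenvalues = [14.47963, 8.191949, 2.440719, 1.216678, 0.500027]
n_series = 27  # trace of a 27 x 27 correlation matrix (assumed normalisation)

# Variance proportion of each PC and the running total up to that PC.
var_prop = [e / n_series for e in eigenvalues]
cum_prop = list(accumulate(var_prop))
```

The resulting values agree with the "Var. prop." and "Cum. prop." rows of the table to six
decimal places, up to rounding.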

We can highlight the best values for the percentage of cumulative explained
variance for each PC: 56% in the case of GVF^G (first PC), 84% in the case of GNS^O
(second PC) and 93% (third PC), 97% (fourth PC) and 99% (fifth PC) in the case of
NS^O. Across all eight models, the first five factors capture at least 95% of the
variation in the volatility time series.

In this section, we can assert that the first three PCs are quite similar among dif- 
ferent models. Particularly, the first PC keeps quasi constant over the whole volatility 
term structure (VTS) and the eight models. So, we can interpret it as the general level 
of the volatility (level or trend). With respect to the second PC, it presents coeffi- 
cients of opposite sign in the short term and coefficients of the same sign in the long 
term, so this component can be interpreted as the difference between the levels of 
volatility between the two ends of the VTS (slope or tilt). Finally, the third PC shows 
changing signs of the coefficients, so this PC could be interpreted as changes in the 
curvature of the VTS (curvature). So, an important insight is that the three factors 
may be interpreted in terms of level, slope and curvature. 

With regard to the fourth and fifth PCs, they differ somewhat across
models; nevertheless, these PCs can be related to a higher or lower hump of the VTS.

In order to finish this analysis, we want to test whether the first three PCs, which 
clearly reflect level, slope and curvature of the VTS, and the last two PCs are different 
among our eight models (historical and conditional volatilities). 

Considering the results from Table 3, the statistics related to differences
in mean point to homogeneity in mean for our eight models, as we cannot
reject the null hypothesis. In the case of differences in median, we find evidence against
the null hypothesis of equal medians for the fifth PC, whereas for the other PCs
the null hypothesis cannot be rejected.


Table 3. Tests of equality of means, medians and variances among different models

Test  PC1          PC2          PC3          PC4          PC5
F     0.012749     0.056012     0.020015     0.179951     0.024021
K-W   1.016249     2.452214     3.810190     11.82140     55.13159^c
vW    0.518985     0.795634     2.032438     8.070648     45.21040^c
L     4.033919^c   23.92485^c   16.57419^c   66.74642^c   67.33491^c
B-F   4.064720^c   23.87826^c   16.51119^c   65.80991^c   67.06584^c

^a p < 0.10, ^b p < 0.05, ^c p < 0.01
F: Anova-F Test, K-W: Kruskal-Wallis Test, vW: van der Waerden Test, L: Levene Test,
B-F: Brown-Forsythe Test

On the other hand, statistics to test whether the PC variance produced by our eight 
models is the same or not also appear in Table 3. For all the PCs, these statistics offer 
strong evidence against the null hypothesis. 

Summarising, in this section we have concluded that the first three PCs can be
related to level, slope and curvature of the VTS and, besides, these PCs are not
significantly different in mean and median among our eight models. Nevertheless,
PC4 and PC5 are significantly different between our models.


6 Conclusions 


This paper aims to provide new insights into the behaviour of the VTS of interest rates 
by using historical volatility estimates from four different models of the term structure 
of interest rate (TSIR) and applying alternative conditional volatility specifications 
(using GARCH models) from 1994 to 2006. We have used the mentioned models, 
and we have worked out the volatility time series using 30-, 60- and 90-day moving 
windows in order to construct the VTS. 

First of all, the results of our analysis show that there are statistically significant
differences between estimates of the term structure of interest rate volatilities depending
on the model used to estimate the term structure and the heteroscedasticity
structure of errors (NS^O, NS^G, VF^O and VF^G), mainly in the short term (less than
one year) and in the long term (more than ten years), but these differences do not
depend on procedures to estimate the VTS. Secondly, the previous evidence suggests
that the dynamics of term structures of volatilities can be well described by relatively
few common components. The possible interpretation of these principal components
in terms of level, slope and curvature can describe how the VTS shifts or changes
shape in response to a shock on a PC.

We find that the first three PCs are quite similar among different models and they
can be identified as trend, tilt and curvature. Regarding the fourth and fifth PCs, they
can be related to a higher or lower hump of the VTS. Also, the first three PCs are not
significantly different in mean and median among our eight models. Nevertheless,
PC4 and PC5 are significantly different between our models.
Exact and approximated option pricing in a stochastic
volatility jump-diffusion model


Fernanda D’Ippoliti, Enrico Moretto, Sara Pasquali, and Barbara Trivellato 


Abstract. We propose a stochastic volatility jump-diffusion model for option pricing with 
contemporaneous jumps in both spot return and volatility dynamics. The model admits, in the 
spirit of Heston, a closed-form solution for European-style options. To evaluate more complex 
derivatives for which there is no explicit pricing expression, such as barrier options, a numerical 
methodology, based on an “exact algorithm” proposed by Broadie and Kaya, is applied. This 
technique is called exact as no discretisation of dynamics is required. We end up testing the 
goodness of our methodology using, as real data, prices and implied volatilities from the DJ 
Euro Stoxx 50 market and providing some numerical results for barrier options and their Greeks. 


Key words: stochastic volatility jump-diffusion models, barrier option pricing, rejection sam- 
pling 


1 Introduction 


In recent years, many authors have tried to overcome the Heston setting [11]. This 
is due to the fact that the ability of stochastic volatility models to price short-time 
options is limited [1, 14]. In [2], the author added (proportional) log-normal jumps to 
the dynamics of spot returns in the Heston model (see [10] for log-uniform jumps) and 
extended the Fourier inversion option pricing methodology of [11, 15] for European 
and American options. This further improvement has not been sufficient to capture 
the rapid increase of volatility experienced in financial markets. One documented 
example of this feature is given by the market stress of Fall 1987, when the volatility 
jumped up from roughly 20 % to over 50 %. To fill this gap, the introduction of jumps in 
volatility has been considered the natural evolution of the existing diffusive stochastic 
volatility models with jumps in returns. In [9], the authors recognised that “although 
the motivation for jumps in volatility was to improve on the dynamics of volatility, the 
results indicate that jumps in volatility also have an important cross-sectional impact 
on option prices”. 

In this context, we formulate a stochastic volatility jump-diffusion model that, in
the spirit of Heston, admits a closed-form solution for European-style options. The
evolution of the underlying asset is driven by a stochastic differential equation with
jumps that contains two diffusion terms: the first has constant volatility, as in the Black
and Scholes (B&S) model [4], while the second is of the Heston type. The dynamics
of the volatility follow a square-root process with jumps. We suppose that the arrival
times of both jumps are concurrent, hence we will refer to our model as a stochastic
volatility with contemporaneous jumps (SVCJ) model. We claim that two diffusion
terms in the dynamics of spot returns make our model more flexible than the Heston
one.

Valuation of non-European options usually requires numerical techniques; in most
cases some kind of discretisation is necessary, so that a pricing bias is present. To avoid
this flaw, we opt for the “exact simulation” approach developed by Broadie and Kaya
(B&K) [5,6] for stochastic volatility and other affine jump-diffusion models. This
method is based on both a Fourier inversion technique and some conditioning arguments
so as to simulate the evolution of the value and the variance of an underlying
asset. Unlike B&K’s algorithm, to determine the integral of the variance, we replace
the inverse transform method with a rejection sampling technique. We then compare
the results of the closed-form expression for European-style option prices with their
approximated counterparts using data from the DJ Euro Stoxx 50 derivative market.
Having found that the modified algorithm returns reliable values, we determine prices
and Greeks for barrier options for which no explicit formula exists.


2 Stochastic volatility jump-diffusion model 


Let $(\Omega, \mathcal{F}, Q)$ be a complete probability space where $Q$ is a risk-neutral probability
measure and consider $t \in [0, T]$. We suppose that a bidimensional standard Wiener
process $W = (W_1, W_2)$ and two compound Poisson processes $Z_S$ and $Z_v$ are defined.
We assume that $W_1$, $W_2$, $Z_S$ and $Z_v$ are mutually independent. We suppose that

\[ dS(t) = S(t^-)\left[(r - \lambda j_S)\,dt + \sigma_S\,dW_1(t) + \xi\sqrt{v(t^-)}\,dW_2(t) + dZ_S(t)\right], \tag{1} \]
\[ dv(t) = k^*\left(\theta^* - v(t^-)\right)dt + \sigma_v\sqrt{v(t^-)}\,dW_2(t) + dZ_v(t), \tag{2} \]

where $S(t)$ is the underlying asset, $\sqrt{v(t)}$ is the volatility process, and parameters
$r$, $\sigma_S$, $\xi$, $k^*$, $\theta^*$ and $\sigma_v$ are real constants ($r$ is the riskless rate). The processes $Z_S(t)$
and $Z_v(t)$ have the same constant intensity $\lambda > 0$ (annual frequency of jumps). The
process $Z_S(t)$ has log-normal distribution of jump sizes; if $J_S$ is the relative jump size,
then $\log(1 + J_S)$ is distributed according to the $N\left(\log(1 + j_S) - \tfrac{1}{2}\delta_S^2,\ \delta_S^2\right)$ law, where
$j_S$ is the unconditional mean of $J_S$. The process $Z_v(t)$ has an exponential distribution
of jump sizes $J_v > 0$ with mean $j_v$. Note that $J_S \in (-1, +\infty)$ implies that the stock
price remains positive for all $t \in [0, T]$. The variance $v(t)$ is a mean reverting process
with jumps where $k^*$, $\theta^*$ and $\sigma_v$ are, respectively, the speed of adjustment, the long-run
mean and the variation coefficient. If $k^*, \theta^*, \sigma_v > 0$, $2k^*\theta^* > \sigma_v^2$, $v(0) > 0$ and
$J_v > 0$, then the process $v(t)$ is positive for all $t \in [0, T]$ with probability 1 (see [12]
in the no-jump case) and captures the large positive outliers in volatility documented
in [3]. Jumps in both asset price and variance occur concurrently according to the
counting process $N(t)$. The instantaneous correlation between $S$ and $v$, when a jump
does not occur, is $\rho(t) = \xi\sqrt{v(t)} / \sqrt{\sigma_S^2 + \xi^2 v(t)}$; it depends on two parameters and is
stochastic because it contains the level of volatility at time $t$. We claim that this improves the
Heston model, in which correlation between the underlying and volatility is constant.
Further, $\xi$ in $\rho(t)$ gauges the B&S constant volatility component ($\sigma_S^2$) with the one
driven by $v(t)$ (see [7]). Lastly, the instantaneous variance of returns $\sigma_S^2 + \xi^2 v(t)$
is uniformly bounded from below by a positive constant, and this fact proves to be
useful in many control and filtering problems (see [13]).
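The state dependence of the instantaneous correlation is easy to see numerically. The sketch
below evaluates rho = xi * sqrt(v) / sqrt(sigma_S^2 + xi^2 * v), which follows from the two
diffusion terms of the asset price and the single diffusion term of the variance; the
parameter values are arbitrary assumptions chosen only for illustration.

```python
import math


def instantaneous_correlation(v, sigma_s, xi):
    """Correlation between asset returns and variance when no jump occurs."""
    return xi * math.sqrt(v) / math.sqrt(sigma_s ** 2 + xi ** 2 * v)


# Illustrative parameters: sigma_s = 0.1, xi = -0.5 and two variance levels.
rho_low = instantaneous_correlation(0.01, 0.1, -0.5)
rho_high = instantaneous_correlation(0.04, 0.1, -0.5)
# Without the constant-volatility term the correlation is perfect (here -1).
rho_no_bs = instantaneous_correlation(0.04, 0.0, -0.5)
```

With a negative xi, the correlation becomes more negative as the variance level rises and
tends to -1 when the constant-volatility component is switched off.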


3 Closed formula for European-style options 


By analogy with the B&S and Heston formulae, the price of a call option with strike price
$K$ and maturity $T$ written on the underlying asset $S$ is

\[ C(S, v, t) = S\,P_1(S, v, t) - K e^{-r(T-t)} P_2(S, v, t), \tag{3} \]

where $P_j(S, v, t)$, $j = 1, 2$, are cumulative distribution functions (cdf). In particular,
$\tilde{P}_j(z) := P_j(e^z)$, $z \in \mathbb{R}$, $j = 1, 2$, are the conditional probabilities that the call
option expires in-the-money, namely,

\[ \tilde{P}_j(\log S, v, t; \log K) = Q\{\log S(T) \ge \log K \mid \log S(t) = \log S,\ v(t) = v\}. \tag{4} \]

Using a Fourier transform method one gets

\[ \tilde{P}_j(\log S, v, t; \log K) = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty} \mathcal{R}\left(\frac{e^{-i u_1 \log K}\,\varphi_j(\log S, v, t; u_1, 0)}{i u_1}\right) du_1, \tag{5} \]

where $\mathcal{R}(z)$ denotes the real part of $z \in \mathbb{C}$, and $\varphi_j(\log S, v, t; u_1, u_2)$, $j = 1, 2$, are
characteristic functions. Following [8] and [11], we guess

\[ \varphi_j(Y, v, \tau; u_1, u_2) = \exp\left[C_j(\tau; u_1, u_2) + J_j(\tau; u_1, u_2) + D_j(\tau; u_1, u_2)\,v + i u_1 Y\right], \tag{6} \]

where $Y = \log S$, $\tau = T - t$ and $j = 1, 2$. The explicit expressions of the characteristic
functions are obtained as solutions to partial differential equations (PDEs) (see [7] for
details); densities $\tilde{p}_j(Y, v, \tau; \log K)$ of the distribution functions $\tilde{F}_j(Y, v, \tau; \log K) =
1 - \tilde{P}_j(Y, v, \tau; \log K)$ are then

\[ \tilde{p}_j(Y, v, \tau; \log K) = -\frac{1}{\pi}\int_0^{\infty} \mathcal{R}\left(-e^{-i u_1 \log K}\,\varphi_j(Y, v, \tau; u_1, 0)\right) du_1, \qquad j = 1, 2. \tag{7} \]
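The inversion in (5) can be tried out on a case where the answer is known. The sketch below
applies the same formula, P(X >= k) = 1/2 + (1/pi) * integral of Re(exp(-i u k) phi(u) / (i u)),
to a standard normal variable, whose characteristic function exp(-u^2/2) stands in for the
model's characteristic functions (which are not reproduced here). The midpoint rule,
truncation point and step count are assumptions of this example.

```python
import cmath
import math


def prob_exceeds(k, char_fn, upper=20.0, steps=4000):
    """P(X >= k) by Fourier inversion with a midpoint rule on (0, upper)."""
    h = upper / steps
    total = 0.0
    for n in range(steps):
        u = (n + 0.5) * h  # midpoints avoid the removable singularity at u = 0
        total += (cmath.exp(-1j * u * k) * char_fn(u) / (1j * u)).real
    return 0.5 + total * h / math.pi


def standard_normal_cf(u):
    """Characteristic function of N(0, 1), used here as a stand-in."""
    return cmath.exp(-0.5 * u * u)


p_numeric = prob_exceeds(0.5, standard_normal_cf)
p_exact = 0.5 * math.erfc(0.5 / math.sqrt(2.0))  # 1 - Phi(0.5)
```

The two values agree closely, which is a useful sanity check before plugging in a
characteristic function that has no closed-form cdf.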
4 Generating sample paths


Following [5,6], we now give a Monte Carlo simulation estimator to compute option 
price derivatives without discretising processes S and v. The main idea is that, by 
appropriately conditioning on the paths generated by the variance and jump processes, 
the evolution of the asset price can be represented as a series of log-normal random 
variables. This method is called Exact Simulation Algorithm (ESA) for the SVCJ 
Model. In Step 3 of this method, the sampling from a cdf is done through an inverse 
transform method. Since the inverse function of the cdf is not available in closed form, 
the authors apply a Newton method to obtain a value of the distribution. To avoid the 
inversion of the cdf, we use a rejection sampling whose basic idea is to sample from 
a known distribution proportional to the real cdf (see [7]). This modification involves 
an improvement of efficiency of the algorithm, as the numerical results in Table 2 
show. 
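The generic accept-reject mechanism can be sketched as follows. The target used here is a
toy density, f(x) = 2x on [0, 1], with a uniform proposal and envelope constant c = 2; it is
not the conditional law of the integrated variance used in Step 3 below, whose density has
to be obtained from its characteristic function. All names are illustrative.

```python
import random


def rejection_sample(target_pdf, proposal_sampler, proposal_pdf, c, rng):
    """Draw one value from target_pdf, assuming target_pdf <= c * proposal_pdf."""
    while True:
        x = proposal_sampler(rng)
        # Accept x with probability target_pdf(x) / (c * proposal_pdf(x)).
        if rng.random() * c * proposal_pdf(x) <= target_pdf(x):
            return x


rng = random.Random(2008)
draws = [
    rejection_sample(lambda x: 2.0 * x, lambda r: r.random(), lambda x: 1.0, 2.0, rng)
    for _ in range(20000)
]
sample_mean = sum(draws) / len(draws)  # the exact mean of the target is 2/3
```

No inversion of the cdf is needed; the cost is that, on average, c proposals are drawn per
accepted value, so a tight envelope matters for efficiency.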
To price a path-dependent option whose payoff is a function of the asset price
vector $(S(t_0), \ldots, S(t_M))$ ($M = 1$ for a path-independent option), let $0 = t_0 < t_1 <
\ldots < t_M = T$ be a partition of the interval $[0, T]$ into $M$ possibly unequal segments
of length $\Delta t_i := t_i - t_{i-1}$, for $i = 1, \ldots, M$. Now consider two consecutive time
steps $t_{i-1}$ and $t_i$ on the time grid and assume $v(t_{i-1})$ is known. The algorithm can be
summarised as follows:


Step 1. Generate a Poisson random variable with mean $\lambda \Delta t_i$ and simulate $n_i$, the
number of jumps. Let $\tau_{i,1}$ be the time of the first jump after $t_{i-1}$. Set $u := t_{i-1}$
and $t := \tau_{i,1}$ ($u < t$). If $t > t_i$, skip Steps 5 and 6.

Step 2. Generate a sample from the distribution of $v(t)$ given $v(u)$, that is, a non-central
chi-squared distribution.

Step 3. Generate a sample from the distribution of $\int_u^t v(q)\,dq$ given $v(u)$ and $v(t)$:
this is done by writing the conditional characteristic function of the integral
and then the density function. We simulate a value of the integral applying
the rejection sampling.

Step 4. Recover $\int_u^t \sqrt{v(q)}\,dW_2(q)$ given $v(u)$, $v(t)$ and $\int_u^t v(q)\,dq$.

Step 5. If $t \le t_i$, generate $J_v$ by sampling from an exponential distribution with mean
$j_v$. Update the variance value by setting $v(t) := v(t) + J_v^{(1)}$, where $J_v^{(1)}$ is the
first jump size of the variance.

Step 6. If $t < t_i$, determine the time of the next jump $\tau_{i,2}$ after $\tau_{i,1}$. If $\tau_{i,2} \le t_i$, set
$u := \tau_{i,1}$ and $t := \tau_{i,2}$. Repeat the iteration Steps 2-5 up to $t_i$. If $\tau_{i,2} > t_i$,
set $u := \tau_{i,1}$ and $t := t_i$. Repeat once the iteration Steps 2-4.

Step 7. Define the average variance between $t_{i-1}$ and $t_i$ as

\[ \bar{\sigma}_i^2 = \frac{n_i \delta_S^2 + \sigma_S^2 \Delta t_i}{\Delta t_i}, \tag{8} \]

and an auxiliary variable

\[ \beta_i = \exp\left( n_i \log(1 + j_S) - \lambda j_S \Delta t_i - \frac{\xi^2}{2}\int_{t_{i-1}}^{t_i} v(q)\,dq + \xi\int_{t_{i-1}}^{t_i} \sqrt{v(q)}\,dW_2(q) \right). \tag{9} \]

Using (8) and (9), the value $S(t_i)$ given $S(t_{i-1})$ can be written as

\[ S(t_i) = S(t_{i-1})\,\beta_i \exp\left[\left(r - \frac{\bar{\sigma}_i^2}{2}\right)\Delta t_i + \bar{\sigma}_i\sqrt{\Delta t_i}\,R\right], \tag{10} \]

where $R \sim N(0, 1)$, hence $S(t_i)$ is a lognormal random variable.
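The last step of the algorithm, equations (8) to (10), is a plain function of quantities
produced by the earlier steps. The sketch below assumes those quantities (the number of
jumps, the integrated variance and the stochastic integral against W2) have already been
simulated and are passed in as numbers; the argument names and the reading of (8) and (9)
used here (average variance (n * delta_s^2 + sigma_s^2 * dt) / dt, and the exponent of the
auxiliary variable as written in the code) are this example's interpretation of the text.

```python
import math


def next_asset_price(s_prev, r, dt, lam, j_s, delta_s, sigma_s, xi,
                     n_jumps, int_v, int_sqrt_v_dw2, z):
    """Asset price at the end of a step, given the conditioning quantities.

    int_v          : integral of v(q) dq over the step (from Step 3)
    int_sqrt_v_dw2 : integral of sqrt(v(q)) dW2(q) over the step (from Step 4)
    z              : a standard normal draw
    """
    # Equation (8): average variance from jumps and the constant-volatility term.
    sigma_bar2 = (n_jumps * delta_s ** 2 + sigma_s ** 2 * dt) / dt
    # Equation (9): auxiliary variable collecting jump drift and Heston-type terms.
    beta = math.exp(n_jumps * math.log(1.0 + j_s) - lam * j_s * dt
                    - 0.5 * xi ** 2 * int_v + xi * int_sqrt_v_dw2)
    # Equation (10): conditionally lognormal update.
    return s_prev * beta * math.exp((r - 0.5 * sigma_bar2) * dt
                                    + math.sqrt(sigma_bar2 * dt) * z)


# Sanity check: with no jumps and xi = 0 the step reduces to the B&S update.
s_next = next_asset_price(100.0, 0.05, 1.0, 0.0, 0.0, 0.1, 0.2, 0.0,
                          0, 0.04, 0.0, 0.0)
```

Switching off the jump and Heston-type components recovers the familiar geometric Brownian
motion step, which is a convenient check on the bookkeeping.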


5 Barrier options and their Greeks 


To price barrier options, we choose to apply the conditional Monte Carlo (CMC)
technique, first used in finance in [16]. This method is applicable to path-dependent
derivatives whose prices have a closed-form solution in the B&S setting. It exploits
the following variance-reducing property of conditional expectation: for any random
variables $X$ and $Y$, $\mathrm{var}[E[X|Y]] \le \mathrm{var}[X]$, with strict inequality except in trivial
cases.
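The inequality can be seen in a toy simulation. Here X = Y + Z with Y and Z independent
standard normals, so that E[X|Y] = Y, var[X] = 2 and var[E[X|Y]] = 1. This example is an
illustration of the principle only and has nothing specific to barrier options.

```python
import random
from statistics import variance

rng = random.Random(1)
crude, conditional = [], []
for _ in range(10000):
    y = rng.gauss(0.0, 1.0)
    z = rng.gauss(0.0, 1.0)
    crude.append(y + z)      # a draw of X
    conditional.append(y)    # E[X | Y] = Y, because E[Z] = 0

var_crude = variance(crude)              # close to 2
var_conditional = variance(conditional)  # close to 1
```

Both samples estimate the same mean, but the conditional one does so with roughly half the
variance, because the noise coming from Z has been integrated out analytically.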

Now, we illustrate the CMC method for discrete barrier options. Let C(S(0), 
K,r, T, 0) denote the B&S price of a European call option with constant volatility 
o, maturity 7, strike K, written on an asset with initial price S(O). The discounted 
payoff for a discrete knock-out option with barrier H > S(O) is given by 


$f) =e (SL) — KY Umaxicey 8G) < Hs (11) 


where S(¢;) is the asset price at time f; for a time partition0 = f9 < t) <...<ty= 
T. Using the law of iterated expectations, we obtain the following unconditional price 
of the option 


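$$\begin{aligned}
E\left[ e^{-rT} (S(T) - K)^+\, \mathbb{1}_{\{\max_{1 \le i \le M} S(t_i) < H\}} \right]
&= E\left[ E\left[ e^{-rT} (S(T) - K)^+\, \mathbb{1}_{\{\max_{1 \le i \le M} S(t_i) < H\}} \,\Big|\, \int_0^T v(q)\,dq,\ \int_0^T \sqrt{v(q)}\,dW_2(q),\ J_S \right] \right] \\
&= E\left[ C\left( S(0)\,\bar{\xi}_M, K, r, T, \bar{\sigma}_M \right) \mathbb{1}_{\{\max_{1 \le i \le M} S(t_i) < H\}} \right], \qquad (12)
\end{aligned}$$

where $\bar{\sigma}_M$ and $\bar{\xi}_M$ are defined in (8) and (9), respectively.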
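
As an illustration of the building block used in (12), the sketch below evaluates the B&S call price $C(S(0), K, r, T, \sigma)$ and averages it over a set of conditioning draws. It assumes that the last line of (12) is read as the mean of $C$ evaluated at the rescaled spot and the average volatility, multiplied by the survival indicator; `cmc_estimate` and the structure of `draws` are hypothetical names, and the simulation of the variance and jump paths is not shown.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s0, k, r, t, sigma):
    """B&S price of a European call with constant volatility sigma."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)

def cmc_estimate(draws, s0, k, r, t):
    """Average C(s0 * xi, k, r, t, sigma_bar) * indicator over conditioning draws.

    Each draw is a tuple (xi, sigma_bar, survived); all three are assumed to
    come from a simulation of the variance and jump paths (hypothetical input).
    """
    total = 0.0
    for xi, sigma_bar, survived in draws:
        if survived:
            total += bs_call(s0 * xi, k, r, t, sigma_bar)
    return total / len(draws)
```

For instance, `bs_call(100.0, 100.0, 0.05, 1.0, 0.2)` is approximately 10.45.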

This approach can also be used to generate unbiased estimators for delta, gamma
and rho, exploiting the likelihood ratio (LR) method.

Suppose that $p \in \mathbb{R}^n$ is a vector of parameters with probability density $g_p(X)$,
where $X$ is a random vector that determines the discounted payoff function $f(X)$
defined in (11). The option price is given by

$$\alpha(p) = E[f(X)], \qquad (13)$$

and we are interested in finding the derivative $\alpha'(p)$. From (13), one gets

$$\alpha'(p) = \frac{d}{dp} E[f(X)] = \int f(x)\, \frac{d}{dp} g_p(x)\,dx = E\left[ f(X)\, \frac{g'_p(X)}{g_p(X)} \right]. \qquad (14)$$

The expression $f(X)\, g'_p(X)/g_p(X)$ is an unbiased estimator of $\alpha'(p)$ and the quantity
$g'_p(X)/g_p(X)$ is called the score function. Note that the latter does not depend on $f(X)$
and that the Greek for each option is computed according to which quantity is considered
a parameter in the expression of $g$.
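
The LR idea can be illustrated on a toy problem that is not part of the paper: assume $X \sim N(p, 1)$ and $f(x) = x^2$, so that $\alpha(p) = p^2 + 1$ and $\alpha'(p) = 2p$. For this density the score function is $x - p$. The sketch below, with illustrative names, averages $f(X)$ times the score.

```python
import random

def lr_derivative(p, n_samples=200_000, seed=1):
    """LR estimate of d/dp E[f(X)] for X ~ N(p, 1) and f(x) = x**2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(p, 1.0)
        score = x - p            # d/dp log g_p(x) for the N(p, 1) density
        total += x * x * score   # f(X) times the score function
    return total / n_samples

estimate = lr_derivative(1.0)    # exact derivative at p = 1 is 2
```

The payoff function is never differentiated, which is why the method also applies to discontinuous payoffs such as (11).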

Consider a discrete knock-out barrier option whose payoff is given by (11).
From (14), it follows that the LR estimators for the option Greeks are given by the
product of $f(X)$ and the score function. The score function is determined by using
the key idea of the CMC method: by appropriately conditioning on the paths generated
by the variance and jump processes, the asset price $S$ becomes a log-normal
random variable (see (10)); hence its conditional density is


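$$g(x) = \frac{1}{x\,\bar{\sigma}_i \sqrt{\Delta t_i}}\, \phi(d_i(x)), \qquad (15)$$

where $\bar{\sigma}_i$ is defined in (8), $\phi(\cdot)$ is the standard normal density function and

$$d_i(x) = \frac{\log\left( \dfrac{x}{S(t_{i-1})\,\bar{\xi}_i} \right) - \left( r - \frac{1}{2}\bar{\sigma}_i^2 \right) \Delta t_i}{\bar{\sigma}_i \sqrt{\Delta t_i}}. \qquad (16)$$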


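Now, to find the estimators of delta and gamma, i.e., the first and the second derivatives
with respect to the price of the underlying asset, respectively, we let $p = S$ in (14)
and compute the derivative of $g$ at $S(0)$. After some algebra, we have

$$\left( \frac{\partial g}{\partial S} \right)_{S = S(0)} = \frac{d_1(x)\, \phi(d_1(x))}{x\, S(0)\, \bar{\sigma}_1^2\, \Delta t_1}. \qquad (17)$$

Dividing the latter by $g(x)$ and evaluating the expression at $x = S(t_1)$, we have the
following score function for the LR delta estimator:

$$\frac{d_1}{S(0)\, \bar{\sigma}_1 \sqrt{\Delta t_1}}, \qquad (18)$$

where $d_1$ is defined in (16) and $\bar{\sigma}_1$ in (8). The case of the LR gamma estimator is
analogous. The estimator of delta is given by

$$e^{-rT} (S(T) - K)^+\, \mathbb{1}_{\{\max_{1 \le i \le M} S(t_i) < H\}} \left( \frac{d_1}{S(0)\, \bar{\sigma}_1 \sqrt{\Delta t_1}} \right), \qquad (19)$$

and the estimator of gamma is

$$e^{-rT} (S(T) - K)^+\, \mathbb{1}_{\{\max_{1 \le i \le M} S(t_i) < H\}} \left( \frac{d_1^2 - d_1 \bar{\sigma}_1 \sqrt{\Delta t_1} - 1}{S^2(0)\, \bar{\sigma}_1^2\, \Delta t_1} \right). \qquad (20)$$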


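To compute the estimator of rho, it is sufficient to compute the derivative of $g$ with
respect to $r$, which gives

$$e^{-rT} (S(T) - K)^+\, \mathbb{1}_{\{\max_{1 \le i \le M} S(t_i) < H\}} \left( -T + \sum_{i=1}^{M} \frac{d_i \sqrt{\Delta t_i}}{\bar{\sigma}_i} \right). \qquad (21)$$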
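
The LR delta and gamma weights can be tried out in a simplified setting. The sketch below assumes a single monitoring date ($M = 1$), a constant volatility `sigma` in place of the average volatility, no jumps and an optional barrier; these simplifications and all names are assumptions made for illustration and do not reproduce the full algorithm of the paper. With no barrier the estimates can be compared with the B&S delta and gamma.

```python
import math
import random

def lr_delta_gamma(s0, k, r, sigma, t, barrier=math.inf,
                   n_samples=200_000, seed=2):
    """LR estimates of delta and gamma for a call monitored once, at maturity."""
    rng = random.Random(seed)
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)
    sum_delta = 0.0
    sum_gamma = 0.0
    for _ in range(n_samples):
        d = rng.gauss(0.0, 1.0)   # equals d_1 evaluated at the simulated S(T)
        s_t = s0 * math.exp((r - 0.5 * sigma**2) * t + vol * d)
        if s_t >= barrier:
            continue              # knocked out: zero payoff
        payoff = disc * max(s_t - k, 0.0)
        sum_delta += payoff * d / (s0 * vol)
        sum_gamma += payoff * (d * d - d * vol - 1.0) / (s0**2 * vol**2)
    return sum_delta / n_samples, sum_gamma / n_samples

delta, gamma = lr_delta_gamma(100.0, 100.0, 0.05, 0.2, 1.0)
```

Passing a finite `barrier` reuses the same weights, because the barrier enters only through the payoff.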
6 Numerical results


In this section, we apply our model to the DJ Euro Stoxx 50 market (data provided by 
Banca IMI, Milan), using the set of parameters reported in Table 1. These parameters 
have been chosen in order to test the efficiency of our algorithm and obtain a good 
approximation of market volatilities. Model calibration is beyond the scope of this 
paper and is left for further research. 


Table 1. Values of parameters of the models (1) and (2)

$\theta^*$   $k^*$   $\sigma_S$   $\sigma_v$   $\xi$   $\lambda$   $j_S$   $\delta_S$   $j_v$
0.175        0.25    0.08         0.2          -0.4    0.05        0.025   0.02         0.03


The Dow Jones Euro Stoxx 50 (DJ50) ‘blue-chip’ index covers the fifty largest
sector leaders in the Eurozone whose stocks belong to the Dow Jones Euro Stoxx Index. The DJ50
option market is very liquid and ranges widely in both maturities (from one month to
ten years) and strike prices (moneyness from 90% up to 115%). It is worth noting that
indexes carry dividends paid by companies, so that a dividend yield $d$ has to be properly
considered by subtracting it from the drift term in the dynamics of $S$. Volatilities in
Table 2 (column 3) represent the term $\sqrt{\sigma_S^2 + \xi^2 v(0)}$ (the instantaneous volatility of the
spot return at $t = 0$, and not simply $\sigma_S$ as in the B&S model), where $v(0)$ is the initial
value of the stochastic volatility dynamics. It follows that we can obtain $v(0)$ from
$v(0) = (\sigma_{MKT}^2 - \sigma_S^2)/\xi^2$, where $\sigma_{MKT}$ is the market volatility.
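
As a worked example of this inversion, the sketch below computes $v(0)$ for the at-the-money market volatility 0.1550 of Table 2. It assumes that the values 0.08 and -0.4 in Table 1 are the spot volatility and the loading of the stochastic variance in the spot dynamics, respectively; the function name is illustrative.

```python
def initial_variance(sigma_mkt, sigma_s, xi):
    """Solve sigma_mkt**2 = sigma_s**2 + xi**2 * v0 for v0."""
    return (sigma_mkt**2 - sigma_s**2) / xi**2

# (0.1550**2 - 0.08**2) / 0.4**2 = 0.017625 / 0.16
v0_atm = initial_variance(sigma_mkt=0.1550, sigma_s=0.08, xi=-0.4)
```

Under these assumptions the at-the-money initial variance is about 0.110.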


Exact versus approximated pricing 


We present some numerical comparisons of the ESA described in Section 4 and other 
simulation methods. For this purpose, we use European call options on November 23, 
2006; relevant data are shown in Table 2. We compare prices derived with different 
methods: the closed formula (3) (column 4), the ESA modified with the rejection 
sampling (column 5), the ESA proposed in [5,6] (column 6), and a Monte Carlo 
estimator (see (5) in [6]) (column 7). For the ESAs, we simulate 100,000 variance 
paths and 1000 price paths conditional on each variance path and jumps. Prices in 
column 5 are very similar to those obtained with the closed formula (3) and improve 
the approximation obtained using the original ESA. Our results also confirm that ESA 
is more efficient than a standard Monte Carlo approach, as stated in [5,6]. The time 
needed to obtain each price with the ESA in column 5 is about 545 seconds with a 
FORTRAN code running on an AMD Athlon MP 2800+, 2.25 GHz processor. This 
computational time is shorter than that reported in [5,6] for a comparable number of 
simulations. 

This is an encouraging result for pricing options that do not have closed-form
formulae, such as barrier options, and for computing their Greeks.

Table 2. Comparison among prices of European options with spot price S(0) = 4116.40, time
to maturity 1 year, riskless rate r = 3.78% and dividend yield d = 3.37% on November 23,
2006

Moneyness %   Strike K   Mkt vol   Closed        ESA           ESA      MC
                                   formula (3)   (rejection)   (B&K)    price
 90.0         3704.76    0.1780    530.43        530.65        531.18   529.42
 95.0         3910.58    0.1660    383.14        383.65        383.31   382.38
 97.5         4013.49    0.1600    316.19        316.61        316.76   315.50
100.0         4116.40    0.1550    255.95        255.98        256.22   255.29
102.5         4219.31    0.1500    201.77        201.45        201.42   201.07
105.0         4322.22    0.1450    154.22        154.62        153.92   153.43
110.0         4528.04    0.1380     83.71         83.45         83.32    82.60
115.0         4733.86    0.1320     40.13         40.03         38.84    38.59


Valuation of barrier options and Greeks 


We provide some numerical results on the valuation of discrete “up-and-out” barrier
call options whose prices are given by (12), with M = 2 monitoring times and
two different barriers, H = 5000 and H = 5500. As a benchmark, we use the
same data as in the European case. Barrier option prices are computed by simulating 1200
volatility paths and 40 price paths conditional on each variance path and jumps.

By comparing the prices of European and barrier options (H = 5000) with the
same moneyness (see Tables 2 and 3), the relative change in price ranges from 24%
(moneyness 90%) to 70% (moneyness 115%). The higher the moneyness, the less
likely the option is to expire with a positive payoff, either because the underlying hits
the barrier before maturity, knocking out the option, or because the option is
out-of-the-money at expiration. This feature is also present when the barrier level
changes, as in Table 3.


Table 3. Prices of barrier options with two different barriers, spot price S(0) = 4116.40, time
to maturity 1 year, riskless rate r = 3.78% and dividend yield d = 3.37% on November 23,
2006

Moneyness %   Strike K   Mkt vol   H = 5000   H = 5500
 90.0         3704.76    0.1780    415.0204   519.1443
 95.0         3910.58    0.1660    298.2722   374.2468
 97.5         4013.49    0.1600    239.5962   308.4141
100.0         4116.40    0.1550    189.5186   249.8215
102.5         4219.31    0.1500    142.6677   196.8798
105.0         4322.22    0.1450    103.9032   149.7763
110.0         4528.04    0.1380     45.5808    79.8648
115.0         4733.86    0.1320     12.1330    35.8308


Finally, Table 4 reports delta and gamma for European and barrier options for
different strikes. The overall time required to obtain each barrier option price (Table 3)
along with its Greeks (Table 4) is about 1600 seconds.


Table 4. Simulation estimates of Greeks for European and barrier options with the following
option parameters: barrier H = 5000, spot price S(0) = 4116.40, time to maturity T - t = 1
(year), riskless rate r = 3.78% and dividend yield d = 3.37% on November 23, 2006

Moneyness %   Strike K   Delta        Delta       Gamma        Gamma
                         (European)   (barrier)   (European)   (barrier)
 97.50        4013.49    0.61919      0.263209    0.00054727   0.0006277
100.00        4116.40    0.56006      0.241053    0.00060312   0.0005138
102.50        4219.31    0.49546      0.210052    0.00065031   0.0003766
105.00        4322.22    0.42850      0.182986    0.00068271   0.0002149


7 Conclusions 


An alternative stochastic volatility jump-diffusion model for option pricing is proposed.
To capture all empirical features of spot returns and volatility, we introduce a
jump component in both dynamics and we suppose that jumps occur concurrently. This
pricing model admits, in the spirit of Heston, a closed-form solution for European-style
options. To evaluate path-dependent options, we propose a modified version of
the numerical algorithm developed in [5,6], whose major advantage is the lack of
discretisation bias. In particular, we replace the inversion technique proposed by the
authors with a rejection sampling procedure to improve the efficiency of the algorithm. We
first apply our methodology to price options written on the DJ Euro Stoxx 50 index,
and then we compare these prices with values obtained by applying the closed-form
expression, the Broadie and Kaya algorithm and a standard Monte Carlo simulation (see
Table 2). The numerical experiments confirm that prices derived with the ESA modified
by rejection sampling provide the most accurate approximation to the
closed-formula values. On the basis of this result, we perform the valuation of
barrier options and Greeks, whose values cannot be expressed in closed form.


A skewed GARCH-type model for multivariate
financial time series


Cinzia Franceschini and Nicola Loperfido 


Abstract. Skewness of a random vector can be evaluated via its third cumulant, i.e., a ma- 
trix whose elements are central moments of order three. In the general case, modelling third 
cumulants might require a large number of parameters, which can be substantially reduced if 
skew-normality of the underlying distribution is assumed. We propose a multivariate GARCH 
model with constant conditional correlations and multivariate skew-normal random shocks. 
The application deals with multivariate financial time series whose skewness is significantly 
negative, according to the sign test for symmetry. 


Key words: financial returns, skew-normal distribution, third cumulant 


1 Introduction 


Observed financial returns are often negatively skewed, i.e., the third central moment 
is negative. This empirical finding is discussed in [3]. [7] conjectures that negative 
skewness originates from asymmetric behaviour of financial markets with respect 
to relevant news. [6] conclude that “Skewness should be taken into account in the
estimation of stock returns”.

Skewness of financial returns has been modelled in several ways. [12] reviews 
previous literature on this topic. [5] models skewness as a direct consequence of the 
feedback effect. [4] generalises the model to the multivariate case. 

All the above authors deal with scalar measures of skewness, even when they 
model multivariate returns. In this paper, we measure skewness of a random vector 
using a matrix containing all its central moments of order three. More precisely, we 
measure and model skewness of a random vector using its third cumulant and the 
multivariate skew-normal distribution [2], respectively. The paper is structured as follows.
Sections 2 and 3 recall the definition and some basic properties of the multivariate 
third moment and the multivariate skew-normal distribution. Section 4 introduces a 
multivariate GARCH-type model with skew-normal errors. Section 5 introduces a 
negatively skewed financial dataset. Section 6 applies the sign test for symmetry to 
the same dataset. Section 7 contains some concluding remarks. 
2 Third moment


The third moment of a p-dimensional random vector $z$ is defined as $\mu_3(z) =
E(z \otimes z^T \otimes z^T)$, where $\otimes$ denotes the Kronecker (tensor) product and the third
moment is assumed to be finite [9, page 177]. The third central moment is defined in a similar way:
$\bar{\mu}_3(z) = \mu_3(z - \mu)$, where $\mu$ denotes the expectation of $z$. For a p-dimensional
random vector the third moment is a $p \times p^2$ matrix containing $p(p+1)(p+2)/6$ possibly
distinct elements. As an example, let $z = (Z_1, Z_2, Z_3)^T$ and $\mu_{ijk} = E(Z_i Z_j Z_k)$,
for $i, j, k = 1, \dots, 3$. Then the third moment of $z$ is

$$\mu_3(z) = \begin{pmatrix}
\mu_{111} & \mu_{112} & \mu_{113} & \mu_{211} & \mu_{212} & \mu_{213} & \mu_{311} & \mu_{312} & \mu_{313} \\
\mu_{121} & \mu_{122} & \mu_{123} & \mu_{221} & \mu_{222} & \mu_{223} & \mu_{321} & \mu_{322} & \mu_{323} \\
\mu_{131} & \mu_{132} & \mu_{133} & \mu_{231} & \mu_{232} & \mu_{233} & \mu_{331} & \mu_{332} & \mu_{333}
\end{pmatrix}.$$

In particular, if all components of z are standardised, its third moment is scale-free, 
exactly like many univariate measures of skewness. 

Moments of linear transformations $y = Az$ admit simple representations in terms
of matrix operations. For example, the expectation $E(y) = AE(z)$ is evaluated via
matrix multiplication only. The variance $V(y) = AV(z)A^T$ is evaluated using both
matrix multiplication and transposition. The third moment $\mu_3(y)$ is evaluated
using matrix multiplication, transposition and the tensor product:


Proposition 1. Let $z$ be a p-dimensional random vector with finite third moment
$\mu_3(z)$ and let $A$ be a $k \times p$ real matrix. Then the third moment of $Az$ is $\mu_3(Az) =
A\mu_3(z)(A^T \otimes A^T)$.
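
Proposition 1 can be checked numerically on a small discrete distribution. The sketch below is written with plain Python lists; the support points and the matrix `a` are hypothetical, and the helper names are illustrative. It builds the $p \times p^2$ third moment with entry $E(Z_i Z_j Z_k)$ in row $i$ and column $jp + k$ (zero-based), and compares both sides of the identity.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def third_moment(points):
    """E(z (x) z^T (x) z^T) for equally likely points; a p x p^2 matrix."""
    n, p = len(points), len(points[0])
    m = [[0.0] * (p * p) for _ in range(p)]
    for z in points:
        for i in range(p):
            for j in range(p):
                for k in range(p):
                    m[i][j * p + k] += z[i] * z[j] * z[k] / n
    return m

points = [(1.0, 2.0), (-1.0, 0.5), (0.0, -3.0), (2.0, 1.0)]  # hypothetical support, p = 2
a = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]                     # hypothetical 3 x 2 matrix
transformed = [tuple(sum(a_ij * z_j for a_ij, z_j in zip(row, z)) for row in a)
               for z in points]

lhs = third_moment(transformed)                       # third moment of Az
at = transpose(a)
rhs = matmul(matmul(a, third_moment(points)), kron(at, at))
```

Here `lhs` and `rhs` are both $3 \times 9$ matrices and agree up to rounding error.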


The third central moment of a random variable is zero when it is finite and the
corresponding distribution is symmetric. There are several definitions of multivariate
symmetry. For example, a random vector $z$ is said to be centrally symmetric at $\mu$ if
$z - \mu$ and $\mu - z$ are identically distributed [15]. The following proposition generalises
this result to the multivariate case.


Proposition 2. If the random vector $z$ is centrally symmetric and the third central
moment is finite, then it is a null matrix.


Sometimes it is more convenient to deal with cumulants, rather than with moments. 
The following proposition generalises to the multivariate case a well known identity 
holding for random variables. 


Proposition 3. The third central moment of a random vector equals its third cumulant, 
when they both are finite. 


The following proposition simplifies the task of finding entries of $\mu_3(z)$ corresponding
to $E(Z_i Z_j Z_k)$.


Proposition 4. Let $z = (Z_1, \dots, Z_p)^T$ be a random vector whose third moment
$\mu_3(z)$ is finite. Then $\mu_3(z) = (M_1, \dots, M_p)$, where $M_i = E(Z_i z z^T)$.


Hence $E(Z_i Z_j Z_k)$ is in the ith row and in the jth column of the kth matrix $M_k$.
There is a simple relation between the third central moment and the first, second and
third moments of a random vector [9, page 187]:


Proposition 5. Let $\mu = \mu_1(z)$, $\mu_2(z)$ and $\mu_3(z)$ be the first, second and third moments
of the random vector $z$. Then the third central moment of $z$ can be represented by

$$\mu_3(z) - \mu_2(z) \otimes \mu^T - \mu^T \otimes \mu_2(z) - \mu \left[ \mu_2^V(z) \right]^T + 2\mu \otimes \mu^T \otimes \mu^T,$$

where $A^V$ denotes the vector obtained by stacking the columns of the matrix $A$ on top of each
other.

3 The multivariate skew-normal distribution 


We denote by $z \sim SN_p(\Omega, \alpha)$ a multivariate skew-normal random vector [2] with
scale parameter $\Omega$ and shape parameter $\alpha$. Its probability density function is

$$f(z; \Omega, \alpha) = 2\phi_p(z; \Omega)\, \Phi(\alpha^T z), \quad z, \alpha \in \mathbb{R}^p,\ \Omega \in \mathbb{R}^p \times \mathbb{R}^p, \qquad (1)$$

where $\Phi(\cdot)$ is the cumulative distribution function of a standard normal variable and
$\phi_p(z; \Omega)$ is the probability density function of a p-dimensional normal distribution
with mean $0_p$ and correlation matrix $\Omega$. Expectation, variance and third central
moment of $z \sim SN_p(\Omega, \alpha)$ (i.e., its first three cumulants) have a simple analytical form
[1, 8]:

$$E(z) = \sqrt{\frac{2}{\pi}}\, \delta, \quad V(z) = \Omega - \frac{2}{\pi} \delta \delta^T, \quad \kappa_3(z) = \sqrt{\frac{2}{\pi}} \left( \frac{4}{\pi} - 1 \right) \delta \otimes \delta^T \otimes \delta^T, \qquad (2)$$

where $\delta = \Omega\alpha / \sqrt{1 + \alpha^T \Omega \alpha}$. As a direct consequence, the third cumulant of
$z \sim SN_p(\Omega, \alpha)$ needs only p parameters to be identified, and is a matrix with negative
(null) entries if and only if all components of $\delta$ are negative (null) too.
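
The following sketch evaluates $\delta$ and the third cumulant in (2) for a bivariate example. The correlation matrix `omega` and the shape vector `alpha` are hypothetical values chosen only to illustrate the sign property; the function names are not taken from any library.

```python
import math

def delta_vector(omega, alpha):
    """delta = Omega alpha / sqrt(1 + alpha^T Omega alpha)."""
    omega_alpha = [sum(o * a for o, a in zip(row, alpha)) for row in omega]
    scale = math.sqrt(1.0 + sum(a * oa for a, oa in zip(alpha, omega_alpha)))
    return [oa / scale for oa in omega_alpha]

def sn_third_cumulant(delta):
    """p x p^2 matrix sqrt(2/pi) (4/pi - 1) delta (x) delta^T (x) delta^T."""
    c = math.sqrt(2.0 / math.pi) * (4.0 / math.pi - 1.0)
    p = len(delta)
    return [[c * delta[i] * delta[j] * delta[k] for j in range(p) for k in range(p)]
            for i in range(p)]

omega = [[1.0, 0.8], [0.8, 1.0]]   # hypothetical correlation matrix
alpha = [-2.0, -1.0]               # hypothetical shape vector
delta = delta_vector(omega, alpha)
kappa3 = sn_third_cumulant(delta)
```

With these inputs both components of `delta` are negative, so every entry of `kappa3` is negative as well.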

The second and third moments of $z \sim SN_p(\Omega, \alpha)$ have a simple analytical form
too:

$$\mu_2(z) = E(zz^T) = \Omega, \qquad (3)$$

$$\mu_3(z) = \sqrt{\frac{2}{\pi}} \left[ \delta^T \otimes \Omega + \delta (\Omega^V)^T + \Omega \otimes \delta^T - \delta \otimes \delta^T \otimes \delta^T \right]. \qquad (4)$$

Expectation of $zz^T$ depends on the scale matrix $\Omega$ only, due to the invariance property
of skew-normal distributions: if $z \sim SN_p(\Omega, \alpha)$ then $zz^T \sim W(\Omega, 1)$, i.e., a
Wishart distribution depending on the matrix $\Omega$ only [11]. As a direct consequence,
the distribution of a function $g(\cdot)$ of $z$ satisfying $g(z) = g(-z)$ does not depend
either on $\alpha$ or on $\delta$.

The probability density function of the ith component $z_i$ of $z \sim SN_p(\Omega, \alpha)$ is

$$f(z_i) = 2\phi(z_i)\, \Phi\left( \frac{\delta_i z_i}{\sqrt{1 - \delta_i^2}} \right), \qquad (5)$$

where $\delta_i$ is the ith component of $\delta$ and $\phi$ denotes the pdf of a standard normal
distribution. The third moment of the corresponding standardised distribution is

$$\sqrt{2}\,(4 - \pi) \left( \frac{\delta_i}{\sqrt{\pi - 2\delta_i^2}} \right)^3. \qquad (6)$$

Hence positive (negative) values of $\delta_i$ lead to positive (negative) skewness. Moreover,
positive values of $\delta_i$ lead to $F_i(0) > 1 - F_i(0)$ when $x > 0$, with $F_i$ denoting
the cdf of $z_i$.

4 A skewed GARCH-type model 


In order to describe skewness using a limited number of parameters, we shall introduce
the following model for a p-dimensional vector of financial returns $x_t$:

$$x_t = D_t \varepsilon_t, \quad \varepsilon_t = z_t - E(z_t), \quad z_t \sim SN_p(\Omega, \alpha), \quad D_t = \mathrm{diag}(\sigma_{1t}, \dots, \sigma_{pt}), \qquad (7)$$

$$\sigma_{kt}^2 = \omega_{0k} + \sum_{i=1}^{q} \omega_{ik} (\sigma_{k,t-i}\, \varepsilon_{k,t-i})^2 + \sum_{j=q+1}^{q+p} \omega_{jk}\, \sigma_{k,t+q-j}^2, \qquad (8)$$

where ordinary stationarity assumptions hold and $\{z_t\}$ is a sequence of mutually
independent random vectors.

The following proposition gives the analytical form of the third cumulant
$\kappa_3(x_t)$ of a vector $x_t$ belonging to the above stochastic process. In particular, it shows
that $\kappa_3(x_t)$ is negative (null) when all the elements in the vector $\delta$ are negative (null)
too.


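Proposition 6. Let $\{x_t, t \in \mathbb{Z}\}$ be a stochastic process satisfying (7), (8) and
$E(\sigma_{it} \sigma_{jt} \sigma_{ht}) < +\infty$ for $i, j, h = 1, \dots, p$. Then

$$\kappa_3(x_t) = \mu_3(x_t) = \sqrt{\frac{2}{\pi}} \left( \frac{4}{\pi} - 1 \right) \Delta\, \mu_3(\sigma_t)\, (\Delta \otimes \Delta), \qquad (9)$$

where $\Delta = \mathrm{diag}(\delta_1, \dots, \delta_p)$ and $\sigma_t = (\sigma_{1t}, \dots, \sigma_{pt})^T$.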


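Proof. We shall write $\mu_3(y|w)$ and $\kappa_3(y|w)$ to denote the third moment and the
third cumulant of the random vector $y$, conditionally on the random vector $w$. From
the definition of $\{x_t, t \in \mathbb{Z}\}$ we have the following identities:

$$\kappa_3(x_t|\sigma_t) = \kappa_3\{ D_t [z_t - E(z_t)] \,|\, \sigma_t \} = \kappa_3(D_t z_t|\sigma_t). \qquad (10)$$

Apply now linear properties of the third cumulant:

$$\kappa_3(x_t|\sigma_t) = D_t\, \kappa_3(z_t)\, (D_t \otimes D_t). \qquad (11)$$

By assumption the distribution of $z_t$ is multivariate skew-normal:

$$\kappa_3(x_t|\sigma_t) = \sqrt{\frac{2}{\pi}} \left( \frac{4}{\pi} - 1 \right) D_t\, (\delta \otimes \delta^T \otimes \delta^T)\, (D_t \otimes D_t). \qquad (12)$$

Consider now the following mixed moments of order three:

$$E(X_{it} X_{jt} X_{kt}|\sigma_t) = \sqrt{\frac{2}{\pi}} \left( \frac{4}{\pi} - 1 \right) (\delta_i \sigma_{it})(\delta_j \sigma_{jt})(\delta_k \sigma_{kt}), \quad i, j, k = 1, \dots, p. \qquad (13)$$

We can use definitions of $\Delta$ and $\sigma_t$ to write the above equations in matrix form:

$$\mu_3(x_t|\sigma_t) = \sqrt{\frac{2}{\pi}} \left( \frac{4}{\pi} - 1 \right) (\Delta\sigma_t) \otimes (\Delta\sigma_t)^T \otimes (\Delta\sigma_t)^T. \qquad (14)$$

Ordinary properties of tensor products imply that

$$\mu_3(x_t|\sigma_t) = \sqrt{\frac{2}{\pi}} \left( \frac{4}{\pi} - 1 \right) \Delta\, (\sigma_t \otimes \sigma_t^T \otimes \sigma_t^T)\, (\Delta \otimes \Delta). \qquad (15)$$

By assumption $E(\sigma_{it} \sigma_{jt} \sigma_{ht}) < +\infty$ for $i, j, h = 1, \dots, p$, so that we can take
expectations with respect to $\sigma_t$:

$$\mu_3(x_t) = \sqrt{\frac{2}{\pi}} \left( \frac{4}{\pi} - 1 \right) \Delta\, E(\sigma_t \otimes \sigma_t^T \otimes \sigma_t^T)\, (\Delta \otimes \Delta). \qquad (16)$$

The expectation in the right-hand side of the above equation equals $\mu_3(\sigma_t)$. Moreover,
since $P(\sigma_{it} > 0) = 1$, the assumption $E(\sigma_{it} \sigma_{jt} \sigma_{ht}) < +\infty$ for $i, j, h = 1, \dots, p$
also implies that $E(\sigma_{it}) < +\infty$ for $i = 1, \dots, p$ and that the expectation of $x_t$ equals
the null vector. As a direct consequence, the third moment equals the third cumulant
of $x_t$ and this completes the proof. □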


5 Data analysis 


This section deals with daily percent log-returns (i.e., daily log-returns multiplied by 
100) corresponding to the indices DAX30 (Germany), IBEX35 (Spain) and S&PMIB 
(Italy) from 01/01/2001 to 30/11/2007. The mean vector, the covariance matrix and 
the correlation matrix are 


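$$\begin{pmatrix} -0.0064 \\ 0.030 \\ 0.011 \end{pmatrix}, \quad
\begin{pmatrix} 1.446 & 1.268 & 1.559 \\ 1.268 & 1.549 & 1.531 \\ 1.559 & 1.531 & 2.399 \end{pmatrix} \quad \text{and} \quad
\begin{pmatrix} 1.000 & 0.847 & 0.837 \\ 0.847 & 1.000 & 0.794 \\ 0.837 & 0.794 & 1.000 \end{pmatrix}, \qquad (17)$$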


respectively. Not surprisingly, means are negligible with respect to standard deviations 
and variables are positively correlated. Figure 1 shows histograms and scatterplots. 
Fig. 1. Scatterplots and histograms for DAX30, IBEX35 and S&PMIB

We shall measure skewness using the following indices, defined as:


n —\ 3 
Xie —X. 1-2 2 — 43 3 
=> (2 ). Bets IDS a arta ecg ah, 2 (tie) 

i=1 * ae 4 * 


where $q_i$ is the ith quartile ($i = 1, 2, 3$). Their values for the three series are reported
in Table 1.


Table 1. Skewness coefficients

           A1       A2       A3
S&PMib    -0.227   -1.049   -0.095
Ibex35    -0.046   -1.104   -0.081
Dax30     -0.120   -1.076   -0.089


All indices suggest negative skewness. In order to assess multivariate skewness,
we shall consider the third cumulant and the third moment. The third sample cumulant
is

$$m_3(X) = \frac{1}{n} \sum_{i=1}^{n} (x_i - m) \otimes (x_i - m)^T \otimes (x_i - m)^T, \qquad (19)$$

where $x_i$ is the transpose of the ith row of the $n \times p$ data matrix $X$ and $m$ is the mean
vector. The third sample cumulant of the above data is

$$-\begin{pmatrix}
0.394 & 0.201 & 0.396 & 0.201 & 0.092 & 0.242 & 0.396 & 0.242 & 0.414 \\
0.201 & 0.092 & 0.242 & 0.092 & 0.088 & 0.146 & 0.242 & 0.146 & 0.216 \\
0.396 & 0.242 & 0.414 & 0.242 & 0.146 & 0.216 & 0.414 & 0.216 & 0.446
\end{pmatrix}. \qquad (20)$$


The most interesting feature of the above matrix is the negative sign of all its
elements. The third moment has a similar structure, since all entries but one are
negative:

$$-\begin{pmatrix}
0.422 & 0.173 & 0.340 & 0.173 & 0.025 & 0.199 & 0.340 & 0.199 & 0.3940 \\
0.173 & 0.025 & 0.190 & 0.025 & -0.053 & 0.035 & 0.199 & 0.035 & 0.109 \\
0.340 & 0.190 & 0.390 & 0.199 & 0.035 & 0.109 & 0.394 & 0.109 & 0.365
\end{pmatrix}. \qquad (21)$$


We found the same pattern in other multivariate financial time series from small 
markets. 


6 Sign tests for symmetry 
This section deals with formal testing procedures for the hypothesis of symmetry. 


When testing for symmetry, the default choice for a test statistic is the third standardised
moment, which might be inappropriate for financial data: their dependence
structure and their heavy tails make it difficult to understand its sampling properties.
On the contrary, the sign test for symmetry possesses very appealing sampling properties
when the location parameter is assumed to be known [10, page 247]. When
dealing with financial returns, it is realistic to assume the location parameter to be known
and equal to zero, for theoretical as well as for empirical reasons. From the theoretical point of view, it
prevents systematic gains or losses. From the empirical point of view, as can be seen
in (17), means of observed returns are very close to zero. The following paragraphs in
this section state the model's assumptions, describe the sign test for symmetry and apply
it to the data described in the previous section.

We shall assume the following model for a p-dimensional vector of financial
returns $x_t$: $x_t = D_t \varepsilon_t$, $E(\varepsilon_t) = 0$, $D_t = \mathrm{diag}(\sigma_{1t}, \dots, \sigma_{pt})$ and

$$\sigma_{kt}^2 = \omega_{0k} + \sum_{i=1}^{q} \omega_{ik} (\sigma_{k,t-i}\, \varepsilon_{k,t-i})^2 + \sum_{j=q+1}^{q+p} \omega_{jk}\, \sigma_{k,t+q-j}^2, \qquad (22)$$

where ordinary stationarity assumptions hold and $\{\varepsilon_t\}$ is a sequence of mutually
independent random vectors. We shall test the hypotheses

$$H_0^{ijk}: F_{ijk}(0) = 1 - F_{ijk}(0) \quad \text{versus} \quad H_1^{ijk}: F_{ijk}(0) < 1 - F_{ijk}(0) \qquad (23)$$

for $i, j, k = 1, 2, 3$, where $F_{ijk}$ denotes the cdf of $\varepsilon_{t,i}\, \varepsilon_{t,j}\, \varepsilon_{t,k}$. Many hypotheses $H_a^{ijk}$
for $i, j, k = 1, 2, 3$ and $a = 0, 1$ are equivalent to each other and can be expressed in
a simpler way. For example, $H_0^{iii}$ and $H_1^{iii}$ are equivalent to $F_i(0) = 1 - F_i(0)$ and
$F_i(0) < 1 - F_i(0)$, respectively, where $F_i$ denotes the cdf of $\varepsilon_{t,i}$. Hence it suffices
to test the following systems of hypotheses

$$H_0^{i}: F_i(0) = 1 - F_i(0) \quad \text{versus} \quad H_1^{i}: F_i(0) < 1 - F_i(0), \quad i = 1, 2, 3 \qquad (24)$$

and

$$H_0^{123}: F_{123}(0) = 1 - F_{123}(0) \quad \text{versus} \quad H_1^{123}: F_{123}(0) < 1 - F_{123}(0). \qquad (25)$$

Let $x^1$, $x^2$ and $x^3$ denote the column vectors of returns in the German, Spanish and
Italian markets. The sign test rejects the null hypothesis $H_0^{ijk}$ if the number $n_{ijk}$ of
positive elements in the vector $x^i \circ x^j \circ x^k$ is larger than an assigned value, where
“$\circ$” denotes the Schur (or Hadamard) product. Equivalently, it rejects $H_0^{ijk}$ if $z_{ijk} =
2\sqrt{n}\,(f_{ijk} - 0.5)$ is larger than an assigned value, where $f_{ijk}$ is the relative frequency
of positive elements in $x^i \circ x^j \circ x^k$ and $n$ is its length. Under $H_0^{ijk}$, $n_{ijk} \sim Bi(n, 0.5)$
and $z_{ijk} \sim N(0, 1)$, asymptotically.

Table 2 reports the relative frequencies of positive components in $x^1$, $x^2$, $x^3$ and
$x^1 \circ x^2 \circ x^3$, together with the corresponding test statistics and p-values.

In all four cases, there is little evidence supporting the null hypothesis of symmetry
against the alternative hypothesis of negative asymmetry. Results are consistent
with the exploratory analysis in the previous section and motivate models describing
multivariate negative skewness.


Table 2. Test statistics

Indices    Frequency   Statistic   p-value
Dax30      0.542       3.579       <0.001
Ibex35     0.561       5.180       <0.001
S&PMib     0.543       3.673       <0.001
Product    0.544       3.767       <0.001
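
The sign test statistic is simple to compute. The sketch below assumes the statistic $z = 2\sqrt{n}\,(f - 0.5)$ with a one-sided normal p-value, where $f$ is the relative frequency of positive elements; the function names and the small input vectors are hypothetical and are not the data of Table 2.

```python
import math

def count_positive_products(*series):
    """Number of positive elements in the element-wise (Hadamard) product."""
    return sum(1 for values in zip(*series) if math.prod(values) > 0)

def sign_test(n_positive, n):
    """Return (relative frequency, z statistic, one-sided asymptotic p-value)."""
    f = n_positive / n
    z = 2.0 * math.sqrt(n) * (f - 0.5)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))   # P(N(0, 1) > z)
    return f, z, p_value

# Hypothetical example: 60 positive values out of 100 observations.
f, z, p_value = sign_test(n_positive=60, n=100)
```

In this example $z = 2$ and the one-sided p-value is about 0.023.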


7 Conclusions 


We considered the third cumulant of multivariate financial returns, motivated it 
through a real data example and modeled it through the multivariate skew-normal 
distribution. Preliminary studies hint that negative third cumulants might constitute 
a stylised fact of multivariate financial returns [13], but more studies are needed to 
confirm or disprove this conjecture. By Proposition 2, testing for central symmetry
would be a natural way of doing so. [14] gives an excellent overview of the literature
on this topic. Multivariate GARCH-type models with skew-normal errors might be 
helpful in keeping under control the number of parameters, but some caution is needed 
when using maximum likelihood procedures, since it is well known that sometimes 
they lead to frontier estimates. 
Financial time series and neural networks in a
minority game context


Luca Grilli, Massimo Alfonso Russo, and Angelo Sfrecola 


Abstract. In this paper we consider financial time series from U.S. Fixed Income Market, 
S&P500, DJ Eurostoxx 50, Dow Jones, Mibtel and Nikkei 225. It is well known that financial 
time series reveal some anomalies regarding the Efficient Market Hypothesis and some scaling 
behaviour, such as fat tails and clustered volatility, is evident. This suggests that financial time 
series can be considered as “pseudo”-random. For this kind of time series the prediction power 
of neural networks has been shown to be appreciable [10]. At first, we consider the financial 
time series from the Minority Game point of view and then we apply a neural network with 
learning algorithm in order to analyse its prediction power. We prove that the Fixed Income 
Market shows many differences from other markets in terms of predictability as a measure of 
market efficiency. 


Key words: Minority Game, learning algorithms, neural networks, financial time series, Ef- 
ficient Market Hypothesis 


1 Minority games and financial markets 


At the very beginning of the last century Bachelier [2] introduced the hypothesis 
that price fluctuations follow a random walk; this resulted later in the so-called Effi- 
cient Market Hypothesis (EMH). In such markets arbitrages are not possible and so 
speculation does not produce any gain. Later, empirical studies showed that the im- 
plications of EMH are too strong and the data revealed some anomalies. Even though 
these anomalies are frequent, economists base the Portfolio Theory on the assumption 
that the market is efficient. One of the most important implications of EMH is the 
rationality of all agents, who are gain maximisers and take decisions considering all
the information available (which has to be obtained easily and immediately) and in
general do not face any transaction costs. Is this realistic? The huge literature on this
subject shows that an answer is not easy, but in general some anomalies are present in
the market. One of the main problems is rationality; as a rule, agents make satisfactory
choices instead of optimal ones; they are not deductive in making decisions but
inductive, in the sense that they learn from experience. As a consequence, the rationality
hypothesis is often replaced by the so-called “Bounded Rationality”; see [13] for
more details. Empirical studies also show cluster formations and other anomalies in
financial time series [3,9].


M. Corazza et al. (eds.), Mathematical and Statistical Methods for Actuarial Sciences and Finance
© Springer-Verlag Italia 2010

In order to model the inductive behaviour of financial agents, one of the most 
famous examples is the Minority Game (MG) model. The MG is a simple model of 
interacting agents initially derived from Arthur’s famous El Farol Bar problem [1]. A
popular bar with a limited seating capacity organises a Jazz-music night on Thursdays 
and a fixed number of potential customers (players) has to decide whether to go or 
not to go to the bar. If the bar is too crowded (say more than a fixed capacity level) 
then no customer will have a good time and so they should prefer to stay at home. 
Therefore every week players have to choose one out of two possible actions: to stay 
at home or to go to the bar. The players who are in the minority win the game. 

Since the introduction of the MG model, there have been, to date, 200 papers on 
this subject (there is an overview of literature on MG at the Econophysics website). 

The MG problem is very simple, nevertheless it shows fascinating properties and 
several applications. The underlying idea is competition for limited resources and 
it can be applied to different fields such as stock markets (see [5—7] for a complete 
list of references). In particular the MG can be used to model a very simple market 
system where many heterogeneous agents interact through a price system they all 
contribute to determine. In this market each trader has to take a binary decision every 
time (say buy/sell) and the profit is made only by the players in the minority group. 
For instance, if the price increases it means that the minority of traders are selling and 
they get profit from it. This is a simple market where there is a fixed number of players 
and only one asset; they have to take a binary decision (buy/sell) in each time step 
t. When all players have announced their strategies, prices are set according to
the basic rule that if the minority decides to sell, then the price rises (the sellers make a
profit); if the minority decides to buy, then the price falls (the buyers make a profit).

In this model cooperation is not allowed; players cannot communicate and so they 
all get information from the global minority. In order to make decisions, players use 
the global history of the past minorities or, in most cases, a limited number of past 
minorities that can be considered the time window used by the player. In our case 
the global history is given by the time series of price fluctuations. Let us consider
the set of players $i \in \{1, \dots, N\}$, where $N \in \mathbb{N}$ (odd and fixed). Denote by $t$ the
time step when each player makes a decision. In the market there is one asset and the
possible decision in each time step is buy or sell; as a consequence, player $i$ at
time $t$ chooses $\sigma_t^i \in \{+1, -1\}$ (buy/sell).

In each time step $t$, let $p_t$ be the price of the asset at time $t$; the minority (the
winning strategy) is determined by

$$S_t = -\mathrm{sign}\, \log\left( \frac{p_t}{p_{t-1}} \right).$$


Consequently, the time series of price fluctuations is replaced by a time series 
consisting of two possible values: +1 and —1 (the minority decisions). 
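
The conversion of a price series into minority decisions can be sketched as follows. The function name is illustrative, and the treatment of unchanged prices (coded as 0) is an assumption, since the rule above only covers strictly positive or negative log-returns.

```python
import math

def minority_series(prices):
    """Map prices p_0, p_1, ... to S_t = -sign(log(p_t / p_{t-1}))."""
    signs = []
    for previous, current in zip(prices, prices[1:]):
        log_return = math.log(current / previous)
        if log_return > 0:
            signs.append(-1)   # price rose: the sellers were the minority
        elif log_return < 0:
            signs.append(+1)   # price fell: the buyers were the minority
        else:
            signs.append(0)    # assumption: unchanged prices are flagged with 0
    return signs

history = minority_series([100.0, 101.0, 100.5, 102.0])
```

For the hypothetical prices above, the resulting series is -1, +1, -1.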

In [4] it is shown that similar results can often be obtained by replacing the real 
history with an artificial one. 

The main point is that we suppose that players make their decisions according to 
a learning rule; as a consequence they follow an inductive behaviour, and this affects 
the time series of the minority decisions, which is not simply a random sequence of 
−1 and +1. If this is the case, the time series can be called “pseudo-random” as a 
consequence of the periodicity derived from the generating rule. This periodicity is not 
due to the presence of a “trend” buried under noise; it is a consequence of 
the inductive behaviour of players, and this is the reason why classical techniques such 
as simple autocorrelation analysis do not give us information, by definition, on the 
learning procedure. On the contrary, the neural network with an appropriate learning 
algorithm can capture such “regularities” very well and consequently can predict the 
time series as shown in [10]. The main result presented in [10] is that a neural network 
with an appropriate learning algorithm can predict “pseudo-random” time series very 
well, whatever the learning algorithm that generated them. On the other hand, the neural 
network is not able to predict a randomly generated time series. As a consequence, if we 
apply a similar analysis to financial time series in the MG context presented before, we can 
test for the EMH, since poor prediction power of the neural network 
suggests that the EMH is fulfilled and the time series are randomly generated. If this is not the 
case and the prediction power is remarkable then the time series is “pseudo-random” 
as a consequence of inductive behaviour of the players. The neural network approach 
also reveals the time window of past decisions that players are considering in order 
to make their choice. As we will see, it is dependent on the market we consider. 


2 Neural network and financial time series 


The main issue of this paper is to determine the predictability of financial time se- 
ries taking into account the imperfection of the market as a consequence of agents’ 
behaviour. In [10] it is shown that, when players make their decisions according to 
some learning rule, then the time series of the minority decisions is not simply a 
random sequence of −1 and +1. The time series generated with learning algorithms 
can be defined as “pseudo-random” time series. The reason is that, by construction, it 
presents a sort of periodicity derived from the generating algorithm. This periodicity 
is not evident directly from the time series but a neural network with an appropriate 
learning algorithm can capture such “regularities” and consequently can predict the 
time series. The authors show that, for three artificial sequences of minority deci- 
sions generated according to different algorithms, the prediction power of the neural 
network is very high. 

In this paper we suppose that each player, in order to make her decision, is provided 
with a neural network. We consider time series from U.S. Fixed Income Market, 
S&P500, DJ Eurostoxx 50, Dow Jones, Mibtel and Nikkei 225 (all the time series 
from Jan 2003 to Jan 2008, daily prices, data from the Italian Stock Exchange). 

Following the motivations presented in [10], in this paper we consider a neural 
network that uses the Hebbian Learning algorithm to update the vector of weights. 
The neural network is able to adjust its parameters, at each round of the game, and 
so the perceptron is trying to learn the minority decisions. Let S indicate the minority 
decision, let superscript + indicate the updated value of a parameter, and let x be the 
vector of minority decisions, whose component $x^t \in \{+1, -1\}$ is the entry of the time series 
at instant t. Let us suppose that each player is provided with such a neural network 
and makes her decision according to the following rules: 

$$\sigma_i = \operatorname{sign}(x \cdot \omega_i)$$

$$\omega_i^+ = \omega_i - \frac{\eta}{M}\, x \,\operatorname{sign}\left(\sum_{j=1}^{N} \operatorname{sign}(x \cdot \omega_j)\right) = \omega_i + \frac{\eta}{M}\, S x,$$

where $\omega_i$ is an M-dimensional weight vector from which decisions are made. So each 
player uses an M-bit window of the past minority decisions as an input in order to 
adjust the weights and try to capture the “regularities” in the time series. As we will 
see later, the choice of M is often crucial in order to determine the best prediction 
power. It is possible to compute the number of correct predictions as a function of M in 
order to obtain the value of M for which it is maximum. In this case the parameter 
M indicates how many past price fluctuations are considered by the agent in order to 
make a decision. The parameter $\eta$ is the learning rate. 
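
The two rules can be sketched in Python as a small simulation in which every player is a perceptron reading the same M-bit window of past minority decisions. This is only an illustrative sketch: the number of players, the Gaussian initial weights, the random initial history, the tie rule sign(0) = +1 and the function name `play_minority_game` are assumptions, not details taken from the paper.

```python
import random

def sign(v):
    # Assumption: sign(0) = +1, so every decision is in {+1, -1}.
    return 1 if v >= 0 else -1

def play_minority_game(n_players=11, M=5, eta=1.0, rounds=200, seed=0):
    """Simulate perceptron players; return the minority decisions and minority sizes."""
    rng = random.Random(seed)
    # Hypothetical initialisation: Gaussian weights and a random initial history.
    weights = [[rng.gauss(0.0, 1.0) for _ in range(M)] for _ in range(n_players)]
    x = [rng.choice((-1, 1)) for _ in range(M)]
    series, minority_sizes = [], []
    for _ in range(rounds):
        # sigma_i = sign(x . omega_i)
        decisions = [sign(sum(w_k * x_k for w_k, x_k in zip(w, x))) for w in weights]
        # S = -sign(sum_j sigma_j); the sum is never zero because n_players is odd.
        S = -sign(sum(decisions))
        # omega_i^+ = omega_i + (eta / M) * S * x, applied to every player.
        for w in weights:
            for k in range(M):
                w[k] += (eta / M) * S * x[k]
        series.append(S)
        minority_sizes.append(decisions.count(S))
        # Slide the window: drop the oldest entry, append the newest minority decision.
        x = x[1:] + [S]
    return series, minority_sizes

series, sizes = play_minority_game()
print(series[:10], max(sizes))
```

Because the number of players is odd, the group choosing S is always strictly smaller than half of the players, which is what makes S the minority decision.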

In [10] it is shown that it is crucial to select correctly the window of past entries 
used as an input for the current decision, that is, the parameter M. The 
authors show that the number of correct predictions is maximum if the neural 
network uses the same M as the sequence generator. This suggests that, if a value of M 
exists for which the number of correct predictions is maximum, then it is possible to 
infer that the sequence of minority decisions is generated by a learning algorithm 
with exactly the same value of M. If we apply the same argument to financial market 
time series, the presence of a value M for which the number of correct predictions 
is maximum indicates that the time series is generated by a learning algorithm with 
that parameter M, that is, the length of the time window used by the investor; this 
is key information derived exclusively with this approach. 
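
The search for the best window can be sketched as follows. This is a minimal illustration, assuming a single perceptron trained online with the Hebbian step and started from zero weights; the function names and the two toy sequences are hypothetical and are not the data used in the paper.

```python
import random

def sign(v):
    return 1 if v >= 0 else -1  # assumption: sign(0) = +1

def hit_rate(series, M, eta=1.0):
    """Share of entries predicted correctly by an online Hebbian perceptron with window M."""
    w = [0.0] * M
    hits = 0
    for t in range(M, len(series)):
        x = series[t - M:t]
        prediction = sign(sum(w_k * x_k for w_k, x_k in zip(w, x)))
        hits += prediction == series[t]
        # Hebbian step towards the observed entry: w <- w + (eta / M) * S * x.
        w = [w_k + (eta / M) * series[t] * x_k for w_k, x_k in zip(w, x)]
    return hits / (len(series) - M)

def best_window(series, candidates):
    """Return the window length with the highest hit rate (the first one on ties)."""
    return max(candidates, key=lambda M: hit_rate(series, M))

# Toy inputs (not the paper's data): a periodic sequence and a coin-flip sequence.
periodic = [1, -1] * 500
rng = random.Random(1)
coin_flips = [rng.choice((-1, 1)) for _ in range(2000)]

print(hit_rate(periodic, 2))             # 1.0
print(best_window(periodic, range(1, 6)))  # 2
print(round(hit_rate(coin_flips, 2), 3))   # a value near 0.5 is expected
```

On the periodic toy sequence every entry is predicted, while on coin flips the hit rate stays near one half, which mirrors the contrast between “pseudo-random” and random series described above.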

Moreover, to determine this value we analyse the number of correct predictions of the 
neural network as a function of M. Figures 1, 2, 4, 5, 6 and 7 show the results of these 
simulations. The result is different according to the market considered; in particular 
the case of U.S. Treasury Bond seems to be the most interesting. In this market the 
maximum is reached for M = 32, that is the dimension of the temporal window of 
the past minority decisions to consider as an input of the neural network. The case of 
S&P500, DJ Eurostoxx 50, Dow Jones, Mibtel and Nikkei 225 is completely different; 
the value of M at which the maximum is reached is, in general, very low (M between 3 
and 5). This suggests that in these markets investors look at the very recent past to 
make decisions and do not consider what has happened in the remote past. On the other 
hand, the Fixed Income Market presents a different situation and it seems to be the most 
predictable, since the number of predicted entries is the highest one (about 60% of 
correct predictions). This can be explained by some features that make this market 
different, since it must follow common laws dictated by macroeconomic variables. As a 
consequence, the data present a strong positive correlation between bonds [9]. Another 
reason is that usually only large investors (like insurance companies or mutual funds) 
are interested in long-term bonds, so the expectations about the market evolution are so 
similar that the behaviour of long-term bond prices does not reflect any difference in 
the perceived value of such assets [3]. 

Fig. 1. Number of correct predictions as a function of M in the case of U.S. Treasury Bond. 

The analysis shows that the number of correct predictions depends on the 
parameter M; the same does not hold for the parameter $\eta$, since the number of correct 
predictions remains almost constant as $\eta$ varies (we report it in Fig. 3). 

The neural network approach has shown the presence of a value M for which the 
prediction power is maximum, and this is a signal in the direction that the time series 
is “pseudo-random” and agents use M as the time window. This can be interpreted 
in terms of lack of EMH, but this is only partially true, since the neural network 
cannot predict in a significant way (in terms of number of correct predictions) 
the financial time series considered; this can indicate that these time series are 
randomly generated and so these markets are efficient. This result is not surprising, 
since all the markets considered in this paper present a huge number of transactions 
and huge volumes, and information provided to agents is immediately and easily 
available. For these time series the neural network can predict slightly more than 
50% of entries, which is the expected share of correct predictions when choices are 
made randomly. This is a signal in the direction that these markets fulfil the EMH, 
and results in other directions can be considered “anomalies”. A comparative analysis 
reveals that the Fixed Income Market seems to be the least efficient, since the number 
of correct predictions is maximal. 


3 Conclusions 


In [10] the authors show that in an MG framework, a neural network that uses a Hebbian 
algorithm can predict almost every minority decision in the case in which the sequence 
of minority decisions follows a “pseudo”-random distribution. The neural network 
can capture the “periodicity” of the time series and then predict it. On the other hand 
they show that the prediction power is not so good when the time series is randomly 
generated. In this paper we consider financial time series from U.S. Fixed Income 
Market, S&P500, DJ Eurostoxx 50, Dow Jones, Mibtel and Nikkei 225. If agents 
make satisfactory choices instead of optimal ones, they are inductive in the sense that 
they learn from experience and MG is a very good model for inductive behaviour of 
financial agents. If financial time series are generated by some learning procedure, 
then we can consider financial time series as “pseudo”-random time series and in this 
case the prediction of neural networks is appreciable. So we consider the financial 
time series from the Minority Game point of view and then we apply a neural network 
with learning algorithm in order to analyse its prediction power as a measure of market 
efficiency. 

We show that the case of U.S. Treasury Bond seems to be the most interesting 
since the time window of the past minorities considered by the investor is M = 32, 
which is very high with respect to other markets, and for this time series the neural 
network can predict about 60% of entries. This is a signal in the direction that the 
Fixed Income Market is more predictable as a consequence of features that make 
this market different. On the other hand the case of S&P500, DJ Eurostoxx 50, Dow 
Jones, Mibtel and Nikkei 225 is completely different, as these markets’ investors 
consider only the very recent past (M between 2 and 4) and the neural network can 
predict slightly more than 50% of entries. This can lead us to consider these time 
series as randomly generated and so consider these markets more efficient. In both 
cases the neural network shows the presence of a value M for which the number of 
correct predictions is maximum and this is the number of past entries that agents 
consider in order to make decisions. This information is derived directly from the data. 


Robust estimation of style analysis coefficients 


Michele La Rocca and Domenico Vistocco 


Abstract. Style analysis, as originally proposed by Sharpe, is an asset class factor model aimed 
at obtaining information on the internal allocation of a financial portfolio and at comparing 
portfolios with similar investment strategies. The classical approach is based on a constrained 
linear regression model and the coefficients are usually estimated exploiting a least squares 
procedure. This solution clearly suffers from the presence of outlying observations. The aim of 
the paper is to investigate the use of a robust estimator for style coefficients based on constrained 
quantile regression. The performance of the novel procedure is evaluated by means of a Monte 
Carlo study where different sets of outliers (both in the constituent returns and in the portfolio 
returns) have been considered. 


Key words: style analysis, quantile regression, subsampling 


1 Introduction 


Style analysis, as widely described by Horst et al. [12], is a popular and important 
tool in portfolio management. Firstly, it can be used to estimate the relevant factor 
exposure of a financial portfolio. Secondly, it can be a valuable tool in performance 
measurement since the style portfolio can be used as a benchmark in evaluating the 
portfolio performance. Finally, it can be used to gain highly accurate future portfo- 
lio return predictions since it is well known from empirical studies [12] that factor 
exposures seem to be more relevant than actual portfolio holdings. 

The method, originally proposed by Sharpe [25], is a return-based analysis aimed 
at decomposing portfolio performance with respect to the contribution of different 
constituents composing the portfolio. Each sector is represented by an index whose 
returns are available. The model regresses portfolio returns on constituent returns 
in order to decompose the portfolio performance with respect to each constituent. 
Indeed, in the framework of classical regression, the estimated coefficients measure the 
sensitivity of portfolio expected returns to constituent returns. The classical approach 
is based on a linear regression model, estimated by using least squares, but different 
constraints can be imposed on the coefficients. 

Following Horst et al. [12], we distinguish three types of style models: 


i. weak style analysis: the coefficients are estimated using an unconstrained 
regression model; 

ii. semi-strong style analysis: the coefficients are constrained to be non-negative; 

iii. strong style analysis: the coefficients are constrained to be non-negative and to 
sum to one. 


The three types of style model are typically estimated as regression through the origin. 

The use of the double constraint (strong style analysis) and the absence of the 
intercept allow the interpretation of the regression coefficients in terms of compo- 
sition quotas and the estimation of the internal composition of the portfolio [8, 9]. 
Notwithstanding, classical inferential procedures should be interpreted with caution, 
due to the imposition of inequality linear constraints [13]. Some general results are 
available for the normal linear regression model [11]; a different approach based on 
Bayesian inference is formulated in [10]. 

In the framework of style analysis, a commonly applied solution is the 
approximation proposed by Lobosco and Di Bartolomeo [21]. These authors obtain an 
approximate solution for the confidence intervals of style weights using a second-order 
Taylor approximation. The proposed solution works well except when the 
parameters are on the boundaries, i.e., when one or more parameters are near 0 and/or 
when a parameter falls near 1. Kim et al. [14] propose two approximate solutions for this 
special case, based on the method of Andrews [1] and on the Bayesian method 
proposed by Geweke [11]. A different Bayesian approach is instead discussed by 
Christodoulakis [6, 7]. 

As they are essentially based on a least-squares estimation procedure, common 
solutions for the estimation of the style analysis coefficients suffer from the presence 
of outliers. In this paper we investigate the use of quantile regression [18] to estimate 
style coefficients. In particular we compare the classical solution for the strong style 
model with robust estimators based on constrained median regression. Different sets 
of outliers have been simulated both in constituent returns and in portfolio returns. The 
estimators are then compared with respect to efficiency, and some considerations on the 
consistency of the median regression estimator are provided too. The use of the quantile 
regression approach allows a further gain in efficiency as an L-estimator [15, 19] can 
be easily obtained using linear combinations of quantile estimators, i.e., for different 
conditional quantiles. 

The paper is organised as follows: in the next section the classical Sharpe-style 
model is briefly introduced along with the basic notation. In Section 3 the quantile 
regression approach to style analysis is described. The simulation schema and the 
main results are discussed in Section 4. Finally, some concluding remarks and possible 
further developments are provided in Section 5. 


2 Sharpe-style regression model 


The Sharpe-style analysis model regresses portfolio returns on the returns of a variety 
of investment classes. The method thus identifies the portfolio style in the time 
series of its returns and of constituent returns [12]. The use of past returns is a Hobson’s 
choice as typically there is no other information available to external investors. 

Let us denote by $r^{port}$ the random vector of portfolio returns along time and by 
$R^{const}$ the matrix containing the returns along time of the i-th portfolio constituent on 
the i-th column (i = 1, ..., n). Data refer to T subsequent time periods. The style 
analysis model regresses portfolio returns on the returns of the n constituents: 

$$r^{port} = R^{const} w^{const} + e \quad \text{s.t.: } w^{const} \ge 0,\ \mathbf{1}^T w^{const} = 1.$$

The random vector e can be interpreted as the tracking error of the portfolio, where 
$E(R^{const} e) = 0$. 

Style analysis models can vary with respect to the choice of style indexes as well 
as with respect to the specific location of the response conditional distribution they 
are estimating. The classical style analysis model is based on a constrained linear 
regression model estimated by least squares [25,26]. This model focuses on the 
conditional expectation of portfolio returns distribution $E(r^{port} \mid R^{const})$: estimated 
compositions are interpretable in terms of sensitivity of portfolio expected returns to 
constituent returns. 
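
A minimal Python sketch of the constrained least-squares estimation is given below. It uses projected gradient descent on the unit simplex, which is a simple substitute chosen for illustration and not necessarily the procedure used by the authors (a quadratic programming solver is the more usual choice). The function names and the synthetic data are hypothetical.

```python
import random

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, shift = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            shift = candidate
    return [max(v_i - shift, 0.0) for v_i in v]

def strong_style_weights(R, r, iterations=2000):
    """Least-squares weights under w >= 0 and sum(w) = 1 (projected gradient descent)."""
    T, n = len(R), len(R[0])
    # Precompute G = R'R and b = R'r so that each iteration costs O(n^2).
    G = [[sum(R[t][i] * R[t][j] for t in range(T)) for j in range(n)] for i in range(n)]
    b = [sum(R[t][i] * r[t] for t in range(T)) for i in range(n)]
    step = 1.0 / sum(G[i][i] for i in range(n))  # trace(G) bounds the largest eigenvalue
    w = [1.0 / n] * n
    for _ in range(iterations):
        gradient = [sum(G[i][j] * w[j] for j in range(n)) - b[i] for i in range(n)]
        w = project_to_simplex([w[i] - step * gradient[i] for i in range(n)])
    return w

# Synthetic check with hypothetical data: noise-free returns built from known weights.
rng = random.Random(0)
true_w = [0.5, 0.3, 0.2, 0.0]
R = [[rng.gauss(0.0, 0.01) for _ in true_w] for _ in range(250)]
r = [sum(R_ti * w_i for R_ti, w_i in zip(row, true_w)) for row in R]
print([round(w_i, 3) for w_i in strong_style_weights(R, r)])
```

With noise-free data the recovered weights should agree with the weights used to build the portfolio, including the one that sits on the boundary at zero.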

The presence of the two constraints forces the coefficients to be exhaustive 
and non-negative, thus allowing their interpretation in terms of compositional data: 
the estimated coefficients represent constituent quotas in composing the portfolio. The 
$R^{const} w^{const}$ term of the equation can be interpreted as the return of a weighted 
portfolio: the portfolio with optimised weights is then a portfolio with the same 
style as the observed portfolio. It differs from the former as estimates of the internal 
composition are available [8, 9]. We refer the interested reader to the paper of Kim et 
al. [14] for the assumptions on portfolio returns and on constituent returns commonly 
adopted in style models. 

In the following we restrict our attention to the strong style analysis model, i.e., 
the model where both the above constraints are considered for estimating style coef- 
ficients. Even if such constraints cause some problems for inference, the strong style 
model is nevertheless widespread for the above-mentioned interpretation issues. 


3 A robust approach to style analysis 


Quantile regression (QR), as introduced by Koenker and Bassett [18], can be viewed 
as an extension of classical least-squares estimation of conditional mean models to 
the estimation of a set of conditional quantile functions. For a comprehensive review 
of general quantile modelling and estimation, see [16]. 

The use of QR in the style analysis context was originally proposed in [5] and 
revisited in [2] and [3]. It offers a useful complement to the standard model as it 
allows discrimination of portfolios that would be otherwise judged equivalent [4]. 
In the classical approach, portfolio style is determined by estimating the influence 
of style exposure on expected returns. Extracting information at places other than 
the expected value should provide useful insights as the style exposure could affect 
returns in different ways at different locations of the portfolio returns distribution. By 
exploiting QR, a more detailed comparison of financial portfolios can then be achieved 
as QR coefficients are interpretable in terms of sensitivity of portfolio conditional 
quantile returns to constituent returns [5]. The QR model for a given conditional 
quantile $\theta$ can be written as: 


$$Q_\theta(r^{port} \mid R^{const}) = R^{const} w^{const}(\theta) \quad \text{s.t.: } w^{const}(\theta) \ge 0,\ \mathbf{1}^T w^{const}(\theta) = 1,\ \forall \theta,$$

where $\theta$ ($0 < \theta < 1$) denotes the particular quantile of interest. 

As for the classical model, the $w_i^{const}(\theta)$ coefficient of the QR model can be 
interpreted as the rate of change of the $\theta$-th conditional quantile of the portfolio returns 
distribution for a unit change in the i-th constituent returns, holding the values of 
$R_j^{const}$, $j \neq i$, constant. 

The conditional quantiles are estimated through an optimisation function minimising 
a sum of weighted absolute deviations, where the choice of the weight determines 
the particular conditional quantile to estimate. We refer to Koenker and Ng [20] for 
computing inequality constrained quantile regression. 
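
The role of the weights can be illustrated in the simplest setting, a location model with no regressors and no constraints. The sketch below uses the standard check function, which weights positive deviations by $\theta$ and negative deviations by $1 - \theta$; the data and function names are hypothetical, and the search over observed values is a didactic shortcut rather than the algorithm of Koenker and Ng [20].

```python
def check_loss(u, theta):
    """Check function: rho_theta(u) = u * (theta - 1{u < 0})."""
    return u * (theta - (1.0 if u < 0 else 0.0))

def sample_quantile_by_check_loss(values, theta):
    """Return the observed value c minimising sum_t rho_theta(y_t - c)."""
    return min(values, key=lambda c: sum(check_loss(y - c, theta) for y in values))

# Hypothetical returns with one gross outlier.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
print(sample_quantile_by_check_loss(y, 0.5))  # 3.0, the median
print(sum(y) / len(y))                        # 22.0, the mean is dragged by the outlier
```

Setting $\theta = 0.5$ weights both sides equally and returns the median, which is unaffected by the size of the outlier, whereas the least-squares counterpart (the mean) is not.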

The use of absolute deviations ensures that conditional quantile estimates are 
robust. The method is nonparametric in the sense that it does not assume any specific 
probability distribution of the observations. In the following we use a semiparametric 
approach as we assume a linear model in order to compare QR estimates with the 
classical style model. Moreover we restrict our attention to the median regression 
by setting $\theta = 0.5$. As previously stated, it is worthwhile to mention that the use of 
different values of $\theta$ allows a set of conditional quantile estimators to be obtained that 
can be easily linearly combined in order to construct an L-estimator and thus gain 
efficiency [15, 19]. 


4 Simulation results 


In this section the finite sample properties of the proposed procedure are investigated 
via a Monte Carlo study. Artificial fund returns are simulated using the following 
data-generation process: 


$$r_t^{port} = R_t^{const} w^{const} + \sigma e_t, \qquad t = 1, 2, \ldots, T.$$

In particular, we considered a portfolio with 5 constituents generated by using 
GARCH(1,1) processes to simulate the behaviour of true time series returns. The true 
style weights have been set to $w_i^{const} = 0.2$, i = 1, 2, ..., 5, thus mimicking a 
typical “buy and hold” strategy. This allows a better interpretation of simulation results 
whereas the extension to different management strategies does not entail particular 
difficulties. The scaling factor $\sigma$ has been fixed in order to have $R^2$ close to 0.90 
while $e_t \sim N(0, 1)$. We considered additive outliers at randomly chosen positions 
both in the constituent series (in X) and in the portfolio returns (in Y). The percentage 
of outlier contamination has been set both to 1% and to 5%. We considered median 
regression ($\theta = 0.5$) and we used T = 250, 500, 1000 as sample sizes. We carried out 
1000 Monte Carlo runs for each simulation of the experimental set up. 
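
One replication of such a data-generation process could be sketched as follows. The GARCH(1,1) parameter values, the outlier size (five standard deviations) and all function names are assumptions made for illustration, since the text does not report them; only the portfolio series is contaminated here, but `add_outliers` could be applied to a constituent series in the same way.

```python
import math
import random

def garch_returns(T, rng, omega=0.05, alpha=0.10, beta=0.85):
    """Simulate T returns from a GARCH(1,1) process (parameter values are hypothetical)."""
    variance = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns = []
    for _ in range(T):
        value = math.sqrt(variance) * rng.gauss(0.0, 1.0)
        returns.append(value)
        variance = omega + alpha * value ** 2 + beta * variance
    return returns

def add_outliers(series, share, size, rng):
    """Add `size` at a random `share` of positions; return the new series and positions."""
    positions = rng.sample(range(len(series)), int(share * len(series)))
    contaminated = list(series)
    for t in positions:
        contaminated[t] += size
    return contaminated, sorted(positions)

def simulate_portfolio(T=250, n=5, r_squared=0.90, share=0.01, seed=0):
    rng = random.Random(seed)
    weights = [1.0 / n] * n  # equal style weights (0.2 each for n = 5)
    constituents = [garch_returns(T, rng) for _ in range(n)]
    signal = [sum(w * c[t] for w, c in zip(weights, constituents)) for t in range(T)]
    mean = sum(signal) / T
    signal_var = sum((s - mean) ** 2 for s in signal) / T
    # Choose sigma so that var(signal) / (var(signal) + sigma^2) equals the target R^2.
    sigma = math.sqrt(signal_var * (1.0 - r_squared) / r_squared)
    portfolio = [s + sigma * rng.gauss(0.0, 1.0) for s in signal]
    # Assumption: outlier size is five standard deviations of the clean portfolio returns.
    size = 5.0 * math.sqrt(signal_var + sigma ** 2)
    contaminated, positions = add_outliers(portfolio, share, size, rng)
    return constituents, portfolio, contaminated, positions

constituents, clean, dirty, positions = simulate_portfolio()
print(len(constituents), len(clean), positions)
```

Repeating this with different seeds and feeding each replication to the LS and median estimators would give the kind of sampling distributions that the Monte Carlo study compares.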

Figure 1 depicts the impact of outlying observations on LS and QR estimators. 
Each row of the panel graph refers to a portfolio constituent (i = 1,...,5) while 
the columns show the different cases of presence of outliers: no outliers, outliers 
in portfolio returns (in Y), outliers in constituent returns (in X), and outliers both in 
portfolio returns and in constituent returns (in X and Y). In each panel the left boxplot 
refers to the LS estimator while the right one depicts the QR estimator behaviour. As 
expected, the impact of outlying observations can be very serious on LS estimates 
of style coefficients, especially when considering outliers in the constituent series. 
It is worth noticing that the variability of LS estimates increases considerably and 
this can have serious practical drawbacks since the style coefficients vary in the 
unit interval: a large variability of the estimates induces results with limited practical 
utility. Clearly, when no outlying observations are present in the data, the LS estimates 
are more efficient than quantile estimates. However, the differences between the two 
distributions are not so evident. Although it is well known that quantile regression 
estimators are robust only to outliers in Y [16], the simulation study also shows evidence 
of robustness in the case of outliers in constituent returns (third column of panels in 
Fig. 1). A possible explanation can be given by considering the presence of the 
double constraint, which forces each estimated coefficient to be inside the unit interval. 
However, a formal study based on the influence function of the constrained estimators 
is not available at the moment. This issue is still under investigation. 

In order to obtain information on consistency of the constrained median esti- 
mators, we use different values for the length of time series. Figure 2 depicts the 
behaviour of the QR estimator for T = 250 (left boxplot in each panel), T = 500 
(middle boxplot) and T = 1000 (right boxplot). As in the previous figure, the rows 
of the plot refer to the different constituents while the columns report the different 
cases treated in our simulation with respect to the presence of outlying observations. 
It is evident that in any case efficiency increases as sample size increases. 

Figures 1 and 2 are built using a percentage of outlier contamination set to 1%. 
Similar patterns have been noticed for the case of 5% contamination and so the 
related plots are not reported here for the sake of brevity. Using a percentage of 
outlier contamination set to 5%, as expected, an increase in the variability of the QR 
estimator is observed, although there is only a very limited difference between the 
two cases. For the sake of space we do not include any results for the comparison 
between the different cases of outlier contamination considered. It is straightforward 
to note, anyway, that increases in the variability of the QR estimator due to an increase 
in the percentage of outlier contamination are counterbalanced moving the number 
of observations from T = 250 to T = 500 and then to T = 1000. 

Fig. 1. Comparison of the least-squares (LS) estimators and of the median estimators through 
quantile regression (QR) for T = 250. The different subpanels of the two plots refer to the 
portfolio constituents (rows) and to the different cases of presence of outliers (columns). In 
particular the first column depicts the situation with no outlying observation, the second and 
third columns refer, respectively, to the presence of outliers in portfolio returns and outliers in 
constituent returns, while the last column depicts the behaviour of LS and QR estimates when 
outliers are considered both in portfolio returns and in constituent returns. In each panel the 
left boxplot depicts the sampling distribution of the LS estimator while the right one refers to 
the sampling distribution of the QR estimator 


5 Conclusions and further issues 


Style analysis is widely used in financial practice in order to decompose portfolio 
performance with respect to a set of indexes representing the market in which the 
portfolio invests. The classical Sharpe method is commonly used for estimation 
purposes but requires corrections in the presence of outliers. In this paper we 
compare this classical procedure with a robust procedure based on a constrained 
median regression, showing some empirical results for efficiency and consistency of 
the robust estimators. The results of the simulation study encourage us to further 
investigate this approach. A topic deserving further attention is a formal study of the 
robustness of the constrained median regression estimator in the presence of outliers 
in the X series based on the influence function of the constrained robust estimator. 
It is worthwhile to point out that further gain in estimator efficiency can be obtained 
as the median regression has been estimated through quantile regression. Such a 
technique allows a simple extension toward L-estimators (defined as weighted linear 
combinations of different quantiles) in order to gain an increase in efficiency [15, 19]. 
Moreover, many other robust estimators have been proposed and studied for linear 
regression models. However a comparison of their relative merits in the framework 
considered here is beyond the scope of this paper. 

Fig. 2. Comparison of the median estimators through quantile regression (QR) for T = 250, 
T = 500 and T = 1000. The different subpanels of the two plots refer to the portfolio 
constituents (rows) and to the different cases of the presence of outliers (columns). In particular 
the first column depicts the situation with no outlying observation, the second and third columns 
refer, respectively, to the presence of outliers in portfolio returns and outliers in constituent 
returns, while the last column depicts the case when outliers are considered both in portfolio 
returns and in constituent returns. The boxplots depict the sampling distributions of the QR 
estimators for T = 250 (left boxplot), T = 500 (middle boxplot) and T = 1000 (right boxplot) 

A further extension of the proposed approach concerns the use of quantile regres- 
sion to draw inferences on style coefficients. The presence of inequality constraints 
in the style model, indeed, requires some caution in drawing inferences. Among the 
different proposals appearing in the literature, the Lobosco—Di Bartolomeo approx- 
imation [21] for computing corrected standard errors is widespread and it performs 
well for regular cases, i.e., when parameters are not on the boundaries of the param- 
eter space. This proposal, indeed, is a convenient method for estimating confidence 
intervals for style coefficients based on a Taylor expansion. Nevertheless, as it is es- 
sentially based on a least-squares estimation procedure, the Lobosco—Di Bartolomeo 
solution also suffers from the presence of outliers. A possible solution could relate to 
a joint use of quantile regression and subsampling theory [23]. Subsampling was first 
introduced by Politis and Romano [22] and can be considered as the most general 
theory for the construction of first-order asymptotically valid confidence intervals or 
regions. The basic idea is to approximate the sampling distribution of the statistic 
of interest through the values of the statistic (suitably normalised) computed over 
smaller subsets of the data. Subsampling has been shown to be valid under very weak 
assumptions and, when compared to other resampling schemes such as the bootstrap, 
it does not require that the distribution of the statistic is somehow locally smooth 
as a function of the unknown model. Indeed, the subsampling is applicable even in 
situations that represent counterexamples to the bootstrap. These issues are still un- 
der investigation and beyond the scope of this paper. Here it is worth highlighting 
that preliminary results appear promising and encourage us to further investigate this 
approach: confidence intervals based on the joint use of QR and subsampling show 
better performance with respect to both coverage error and length of the intervals. 
The next step should concern an empirical analysis with real financial series. 


Managing demographic risk in enhanced pensions 


Susanna Levantesi and Massimiliano Menzietti 


Abstract. This paper deals with demographic risk analysis in Enhanced Pensions, i.e., long- 
term care (LTC) insurance cover for the retired. Both disability and longevity risks affect such 
cover. Specifically, we concentrate on the risk of systematic deviations between projected and 
realised mortality and disability, adopting a multiple scenario approach. To this purpose we 
study the behaviour of the random risk reserve. Moreover, we analyse the effect of demographic 
risk on risk-based capital requirements, explaining how they can be reduced through either 
safety loading or capital allocation strategies. A profit analysis is also considered. 


Key words: long term care covers, enhanced pension, demographic risks, risk reserve, sol- 
vency requirements 


1 Introduction 


The “Enhanced Pension” (EP) is a long-term care (LTC) insurance cover for the 
retired. It offers an immediate life annuity that is increased once the insured becomes 
LTC disabled and requires a single premium. EP is affected by demographic risks 
(longevity and disability risks) arising from the uncertainty in future mortality and 
disability trends that cause the risk of systematic deviations from the expected values. 
Some analyses of these arguments have been performed by Ferri and Olivieri [1] 
and Olivieri and Pitacco [6]. To evaluate such a risk we carry out an analysis taking 
into account a multiple scenario approach. To define a set of projected scenarios we 
consider general population statistics of mortality and disability. 

We first analyse the behaviour of the risk reserve, then we define the capital 
requirements necessary to guarantee the solvency of the insurer. Finally, we study the 
portfolio profitability. Such an analysis cannot be carried out with analytical tools; 
it requires a Monte Carlo simulation model. 

The paper is organised as follows. In Section 2 we define the actuarial framework 
for EPs. In Section 3 we develop nine demographic scenarios and describe through a 
suitable model how they can change over time. In Section 4 we present a risk theory 
model based on the portfolio risk reserve and the Risk Based Capital requirements 
necessary to protect the insurance company against failure with a fixed confidence 
level. Section 5 deals with the profit analysis of the portfolio according to the profit 
profile. Simulation results are analysed in Section 6, while some concluding remarks 
are presented in Section 7. 


2 Actuarial model for Enhanced Pensions 


The probabilistic framework of an EP is defined consistently with a continuous and 
inhomogeneous multiple state model (see Haberman and Pitacco [2]). Let $S(t)$ 
represent the random state occupied by the insured at time $t$, for any $t \ge 0$, where $t$ is 
the policy duration and 0 the time of entry. The possible realisations of $S(t)$ are: 1 = 
“active” (or healthy), 2 = “LTC disabled” or 3 = “dead”. We disregard the possibility 
of recovery from the LTC state due to the usually chronic character of disability and 
we assume $S(0) = 1$. Let us define transition probabilities and intensities: 


$$P_{ij}(t,u) = \Pr\{S(u) = j \mid S(t) = i\}, \qquad 0 \le t \le u, \quad i,j \in \{1,2,3\}, \tag{1}$$


$$\mu_{ij}(t) = \lim_{u \to t} \frac{P_{ij}(t,u)}{u - t}, \qquad t \ge 0, \quad i,j \in \{1,2,3\}, \quad i \ne j. \tag{2}$$

EPs are single premium covers providing an annuity paid at an annual rate $b_1(t)$ when 
the insured is healthy and an enhanced annuity paid at an annual rate $b_2(t) > b_1(t)$ 
when the insured is LTC disabled. Let us suppose all benefits to be constant with 
time. Let $\omega$ be the maximum policy duration related to residual life expectancy at age 
$x$ and let $v(s,t) = \prod_{h=s+1}^{t} v(h-1,h)$ be the value at time $s$ of a monetary unit at 
time $t$; the actuarial value at time 0 of these benefits, $\Pi(0,\omega)$, is given by: 


$$\Pi(0,\omega) = b_1\, a_{11}(0,\omega) + b_2\, a_{12}(0,\omega), \tag{3}$$


where $a_{ij}(t,u) = \sum_{s=t}^{u-1} P_{ij}(t,s)\, v(t,s)$ for all $i,j \in \{1,2\}$. Assuming the equivalence 
principle, the gross single premium paid in $t = 0$, $\Pi^T$, is defined as: 


$$\Pi^T = \frac{\Pi(0,\omega)}{1 - \alpha - \beta - \gamma\,[a_{11}(0,\omega) + a_{12}(0,\omega)]}, \tag{4}$$


where $\alpha$, $\beta$ and $\gamma$ represent the premium loadings for acquisition, premium earned 
and general expenses, respectively. 
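
As a rough numerical illustration of (3) and (4), the following Python sketch computes the net and gross single premium from given annuity values. The function name `gross_single_premium` and the annuity values 12.0 and 2.5 are hypothetical placeholders, not results of this paper; the benefit amounts and loadings are those used later in Section 6.

```python
def gross_single_premium(b1, b2, a11, a12, alpha, beta, gamma):
    """Return (net, gross) single premium following (3) and (4)."""
    net = b1 * a11 + b2 * a12                          # actuarial value, (3)
    denom = 1.0 - alpha - beta - gamma * (a11 + a12)   # loading factor, (4)
    if denom <= 0.0:
        raise ValueError("loadings are too large: denominator must be positive")
    return net, net / denom


# Hypothetical annuity values; benefits and loadings as quoted in Section 6.
net, gross = gross_single_premium(b1=6000, b2=12000, a11=12.0, a12=2.5,
                                  alpha=0.05, beta=0.02, gamma=0.007)
```

Since the general expense loading is applied to the annuity values, the denominator shrinks as the expected duration of the payments grows, which is why the sketch rejects non-positive denominators.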


3 Demographic scenarios 


Long-term covers, such as the EPs, are affected by demographic trends (mortality and 
disability). A risk source in actuarial evaluations is the uncertainty in future mortality 
and disability; to represent such an uncertainty we adopt different projected scenarios. 

We start from a basic scenario, $H_B$, defined according to the most recent statistical 
data about people reporting disability (see ISTAT [3]) and, consistent with this data, 
the Italian Life-Table SIM-1999. Actives’ mortality $\mu_{13}(t)$ is approximated by the 
Weibull law, while transition intensities $\mu_{12}(t)$ are approximated by the Gompertz law 
(for details about transition intensities’ estimation see Levantesi and Menzietti [5]). 
Disabled mortality intensity $\mu_{23}(t)$ is expressed in terms of $\mu_{13}(t)$ according to the 
time-dependent coefficient $K(t)$: $\mu_{23}(t) = K(t)\,\mu_{13}(t)$. Values of $K(t)$, coming from 
the experience data of an important reinsurance company, are well approximated by 
the function $\exp(c_0 + c_1 t + c_2 t^2)$. 

Mortality of projected scenarios has been modelled evaluating a different set of 
Weibull parameters $(\alpha, \beta)$ for each ISTAT projection (low, main and high hypothesis, 
see ISTAT [4]). Furthermore, the coefficient $K(t)$ is supposed to be the same 
for all scenarios. Regarding transition intensity, $\mu_{12}(t)$, three different sets of Gompertz 
parameters have been defined starting from a basic scenario to represent a 40% 
decrease (Hp. a), a 10% decrease (Hp. b) and a 20% increase (Hp. c) in disability 
trend, respectively. By combining mortality and disability projections we obtain nine 
scenarios. 

We assume that possible changes in demographic scenarios occur every $k$ years; 
e.g., in the numerical implementation 5 years is considered a reasonable time to capture 
demographic changes. Let $H(t)$ be the scenario occurring at time $t$ ($t = 0, k, 2k, \ldots$). 
It is modelled as a time-discrete stochastic process. Let $P(t)$ be the vector of scenario 
probabilities at time $t$ and $M(t)$ the matrix of scenario transition probabilities between 
$t$ and $t + k$. The following equation holds: $P(t+k) = P(t) \cdot M(t)$. 
We suppose that at initial time the occurring scenario is the central one. We assume that 
the stochastic process $H(t)$ is time homogeneous ($M(t) = M$, $\forall t$) and the scenario 
probability distribution, $P(t)$, is stationary after the first period, so that $P(t) = P$, 
$\forall t \ge k$. Note that $P$ is the left eigenvector of the transition matrix $M$ corresponding 
to the eigenvalue 1. Values of $P$ are assigned assuming the greatest probability of 
occurrence for the central scenario and a correlation coefficient between mortality 
and disability equal to 75%: 


P = (0.01 0.03 0.16 0.03 0.54 0.03 0.16 0.03 0.01). 


Further, we assume that transitions between strongly different scenarios are not pos- 
sible in a single period and consistently with supposed correlation between mortality 
and disability, some transitions are more likely than others. 

Resulting scenarios’ transition probabilities are reported in the matrix below. 


$$M = \begin{pmatrix}
0.1650 & 0.1775 & 0.0000 & 0.1775 & 0.4800 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
0.0492 & 0.1850 & 0.0933 & 0.0592 & 0.5100 & 0.1033 & 0.0000 & 0.0000 & 0.0000 \\
0.0000 & 0.0100 & 0.4250 & 0.0000 & 0.5550 & 0.0100 & 0.0000 & 0.0000 & 0.0000 \\
0.0492 & 0.0592 & 0.0000 & 0.1850 & 0.5100 & 0.0000 & 0.0933 & 0.1033 & 0.0000 \\
0.0100 & 0.0300 & 0.1600 & 0.0300 & 0.5400 & 0.0300 & 0.1600 & 0.0300 & 0.0100 \\
0.0000 & 0.1033 & 0.0933 & 0.0000 & 0.5100 & 0.1850 & 0.0000 & 0.0592 & 0.0492 \\
0.0000 & 0.0000 & 0.0000 & 0.0100 & 0.5550 & 0.0000 & 0.4250 & 0.0100 & 0.0000 \\
0.0000 & 0.0000 & 0.0000 & 0.1033 & 0.5100 & 0.0592 & 0.0933 & 0.1850 & 0.0492 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.4800 & 0.1775 & 0.0000 & 0.1775 & 0.1650
\end{pmatrix}$$
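
The stationarity property of the scenario distribution can be checked numerically. The sketch below, in plain Python, starts from the central scenario and applies the transition matrix twice; the helper name `step` is ours, and the tolerance used when comparing with P only absorbs the four-decimal rounding of the published entries.

```python
P = [0.01, 0.03, 0.16, 0.03, 0.54, 0.03, 0.16, 0.03, 0.01]
M = [
    [0.1650, 0.1775, 0.0000, 0.1775, 0.4800, 0.0000, 0.0000, 0.0000, 0.0000],
    [0.0492, 0.1850, 0.0933, 0.0592, 0.5100, 0.1033, 0.0000, 0.0000, 0.0000],
    [0.0000, 0.0100, 0.4250, 0.0000, 0.5550, 0.0100, 0.0000, 0.0000, 0.0000],
    [0.0492, 0.0592, 0.0000, 0.1850, 0.5100, 0.0000, 0.0933, 0.1033, 0.0000],
    [0.0100, 0.0300, 0.1600, 0.0300, 0.5400, 0.0300, 0.1600, 0.0300, 0.0100],
    [0.0000, 0.1033, 0.0933, 0.0000, 0.5100, 0.1850, 0.0000, 0.0592, 0.0492],
    [0.0000, 0.0000, 0.0000, 0.0100, 0.5550, 0.0000, 0.4250, 0.0100, 0.0000],
    [0.0000, 0.0000, 0.0000, 0.1033, 0.5100, 0.0592, 0.0933, 0.1850, 0.0492],
    [0.0000, 0.0000, 0.0000, 0.0000, 0.4800, 0.1775, 0.0000, 0.1775, 0.1650],
]


def step(p, m):
    """One scenario update: row vector p times matrix m."""
    n = len(m)
    return [sum(p[i] * m[i][j] for i in range(n)) for j in range(n)]


start = [0.0] * 9
start[4] = 1.0                     # the central scenario occurs at time 0
after_one = step(start, M)         # distribution at time k (fifth row of M)
after_two = step(after_one, M)     # distribution at time 2k

dev_first = max(abs(a - b) for a, b in zip(after_one, P))
dev_second = max(abs(a - b) for a, b in zip(after_two, P))
```

Both deviations are negligible, which is what "stationary after the first period" means here: the fifth row of the matrix coincides with P, and P is reproduced by a further transition.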


4 A risk theory model 


Demographic risk analysis is carried out on a portfolio of EPs with $N_i(t)$ contracts 
in state $i$ at time $t$, closed to new entries. The random risk reserve is adopted as 
risk measure. It represents the insurer’s ability to meet liabilities, therefore it can 
be considered a valid tool to evaluate the insurance company solvency and, more 
generally, in the risk management assessment. Let $U(0)$ be the value of the risk 
reserve at time 0; the risk reserve at the end of year $t$ is defined as: 


$$U(t) = U(t-1) + P^T(t) + J(t) - E(t) - B(t) - \Delta V(t) - K(t), \tag{5}$$


where: 


• $P^T(t)$ is the gross single premium income; 
• $J(t)$ are the investment returns on assets, $A(t)$, where the assets are defined as 
  $A(t) = A(t-1) + P^T(t) - E(t) - B(t) + J(t) - K(t)$; 
• $E(t)$ are the expenses: $E(t) = \sum_{i=1,2} N_i(t-1)\, e_i(t)$; 
• $B(t)$ is the outcome for benefits: $B(t) = \sum_{i=1,2} N_i(t-1)\, b_i$; 
• $\Delta V(t)$ is the annual increment in technical provision, $V(t) = \sum_{i=1,2} N_i(t)\, V_i(t)$, 
  and $V_i(t)$ is the technical provision for an insured in state $i$; 
• $K(t)$ are the capital flows; if $K(t) > 0$ the insurance company distributes dividends 
  and if $K(t) < 0$ stockholders invest capital. 


We assume that premiums, benefits, expenses and capital flows are paid at the beginning 
of each year. To compare outputs of different scenarios and portfolios we use 
the ratio between risk reserve and total single premium income 


$$u(t) = \frac{U(t)}{\Pi(0,\omega)\, N_1(0)}.$$
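
A minimal Python sketch of one application of recursion (5), followed by the ratio u(t), is given below. The function names and all monetary figures are hypothetical and serve only to show the sign conventions; in particular, a negative increment of the technical provision (a release of reserves) raises the risk reserve.

```python
def risk_reserve_step(u_prev, premiums, inv_return, expenses, benefits,
                      delta_v, capital_flow):
    """Apply recursion (5) once and return U(t)."""
    return (u_prev + premiums + inv_return - expenses - benefits
            - delta_v - capital_flow)


def reserve_ratio(u_t, premium_income):
    """Return u(t), the risk reserve per unit of total single premium income."""
    return u_t / premium_income


# Hypothetical one-year figures (thousands of euros), not taken from the paper.
u1 = risk_reserve_step(u_prev=500.0, premiums=0.0, inv_return=120.0,
                       expenses=10.0, benefits=900.0, delta_v=-850.0,
                       capital_flow=0.0)
ratio = reserve_ratio(u1, premium_income=100000.0)
```

In a closed portfolio of single premium contracts the premium term is zero after the first year, so the reserve is driven by investment returns, benefit payments and the change in provisions.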


The risk analysis is performed according to a multiple scenarios approach that con- 
siders each scenario as a possible state of the stochastic process H(t), according to 
the probability vector P, allowing evaluation of the risk of systematic deviations in 
biometric functions (see Olivieri and Pitacco [6] and Levantesi and Menzietti [5]). 

The demographic pricing basis is defined according to the central scenario with a 
safety loading given by a reduction of death probabilities. We disregard financial risk, 
adopting a deterministic and constant interest rate. We assume a financial pricing basis 
equal to the real-world one. Technical provision is reviewed every 5 years consistently 
with the scenario change period. Further, the insurance company perceives scenario 
changes with a delay of one period. 


4.1 Risk-based capital requirements 


Risk-based capital (RBC) is a method for assessing the solvency of an insurance 
company; it consists in computing capital requirements that reflect the size of overall 
risk exposures of an insurer. Let us consider RBC requirements based on risk reserve 
distribution. We calculate RBC requirements with different time horizons and confidence 
levels. Let us define the finite time ruin probability as the probability of being 
in a ruin state in at least one of the time points $1, 2, \ldots, T$, for a given $U(0) = u$: 


$$\Psi_u(0,T) = 1 - \Pr\left\{ \bigwedge_{t=1}^{T} U(t) \ge 0 \,\middle|\, U(0) = u \right\}. \tag{6}$$


RBC requirements for the time horizon $(0,T)$ with a $(1-\epsilon)$ confidence level are 
defined as follows: 


$$\mathrm{RBC}_{1-\epsilon}(0,T) = \inf\left\{ U(0) \ge 0 \mid \Psi_u(0,T) < \epsilon \right\}. \tag{7}$$


Note that the risk reserve must be non-negative for all $t \in (0,T)$. 
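
Definitions (6) and (7) can be approximated on simulated paths. The Python sketch below rests on two assumptions of ours: an initial capital simply accumulates at a constant interest rate alongside the simulated reserve, and the strict inequality in (7) is replaced by an empirical ruin frequency not exceeding the tolerance. The toy random-walk paths are not the portfolio model of this paper, and all function names are hypothetical.

```python
import random


def required_capital(path, rate):
    """Smallest initial capital that keeps one path non-negative at t = 1..T."""
    worst = max(-u / (1.0 + rate) ** t for t, u in enumerate(path, start=1))
    return max(0.0, worst)


def ruin_probability(paths, capital, rate):
    """Empirical finite time ruin probability for a given initial capital."""
    ruined = sum(1 for p in paths if required_capital(p, rate) > capital)
    return ruined / len(paths)


def rbc(paths, eps, rate):
    """Smallest capital whose empirical ruin probability does not exceed eps."""
    needs = sorted(required_capital(p, rate) for p in paths)
    allowed = int(eps * len(needs))       # number of paths allowed to be ruined
    return needs[len(needs) - allowed - 1]


# Toy reserve paths starting from zero capital (not the model of the paper).
random.seed(0)
paths = []
for _ in range(2000):
    u, path = 0.0, []
    for _year in range(10):
        u = u * 1.03 + random.gauss(0.2, 1.0)
        path.append(u)
    paths.append(path)

capital = rbc(paths, eps=0.005, rate=0.03)
psi = ruin_probability(paths, capital, rate=0.03)
```

Working with the capital each path would need, rather than re-simulating for every candidate capital, gives the infimum in (7) directly as an order statistic.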

To make data comparable, results are expressed as a ratio between RBC requirements 
and total single premium income: 


$$\mathrm{rbc}_{1-\epsilon}(0,T) = \frac{\mathrm{RBC}_{1-\epsilon}(0,T)}{\Pi(0,\omega)\, N_1(0)}.$$


An alternative method to calculate RBC requirements is based on the Value-at-Risk 
(VaR) of the $U$-distribution in the time horizon $(0,T)$ with a $(1-\epsilon)$ confidence level: 
$\mathrm{VaR}_{1-\epsilon}(0,T) = -U_\epsilon(T)$, where $U_\epsilon(t)$ is the $\epsilon$-th quantile of the $U$-distribution at 
time $t$. Hence RBC requirements are given by: 


$$\mathrm{RBC}^{VaR}_{1-\epsilon}(0,T) = \mathrm{VaR}_{1-\epsilon}(0,T)\, v(0,T). \tag{8}$$


If an initial capital $U(0)$ is given, the $\mathrm{RBC}^{VaR}_{1-\epsilon}(0,T)$ requirements increase by the 
amount $U(0)$. Values are reported in relative terms as 


$$\mathrm{rbc}^{VaR}_{1-\epsilon}(0,T) = \frac{\mathrm{RBC}^{VaR}_{1-\epsilon}(0,T)}{\Pi(0,\omega)\, N_1(0)}.$$


5 Profit analysis 


In this section we analyse the annual profit, $Y(t)$, emerging from the management 
of the portfolio. In order to capture the profit sources, $Y(t)$ can be broken down into 
insurance profit, $Y^I(t)$, and profit coming from investment income on shareholders’ 
funds (which we call “patrimonial profit”), $Y^P(t)$: 


$$Y^I(t) = (1 + i(t-1,t))\,[V(t-1) + P^T(t) - E(t) - B(t)] - V(t), \tag{9}$$

$$Y^P(t) = U(t-1)\, i(t-1,t). \tag{10}$$


The following relation holds: $Y(t) = Y^I(t) + Y^P(t)$. The sequence $\{Y(t)\}_{t \ge 1}$ is called 
profit profile. Let $\rho$ be the rate of return on capital required by the shareholders; the 
present value of future profits discounted at rate $\rho$ (with $\rho > i$), $Y(0,T)$, is given by: 


$$Y(0,T) = \sum_{t=1}^{T} Y(t)\, v_\rho(0,t), \tag{11}$$


while $Y^I(0,T) = \sum_{t=1}^{T} Y^I(t)\, v_\rho(0,t)$ and $Y^P(0,T) = \sum_{t=1}^{T} Y^P(t)\, v_\rho(0,t)$ are the 
present values of the future insurance and patrimonial profits, respectively. 
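
The present value in (11) and its split into the two profit sources can be sketched in Python as follows. We assume a flat discount rate, so that the discount factor for year t is (1 + rho) raised to the power -t; the profit figures are hypothetical.

```python
def present_value(profits, rho):
    """Discount the profit profile Y(1), ..., Y(T) at the flat annual rate rho."""
    return sum(y / (1.0 + rho) ** t for t, y in enumerate(profits, start=1))


# Hypothetical profit profiles, for illustration only.
insurance = [10.0, 12.0, 9.0]       # Y^I(t)
patrimonial = [1.0, 1.5, 2.0]       # Y^P(t)
total = [a + b for a, b in zip(insurance, patrimonial)]

pv_total = present_value(total, rho=0.05)
pv_split = (present_value(insurance, rho=0.05)
            + present_value(patrimonial, rho=0.05))
```

Because discounting is linear, the present value of the total profit equals the sum of the present values of the insurance and patrimonial components.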


6 Portfolio simulation results 


Let us consider a cohort of 1000 policyholders, males, with the same age at policy 
issue, $x = 65$, same year of entry (time 0), a maximum policy duration $\omega = 49$, 
expense loadings $\alpha = 5\%$, $\beta = 2\%$, $\gamma = 0.7\%$ and a constant interest rate 
$i(0,t) = i = 3\%$ $\forall t$. The annual benefit amounts are distributed as in Table 1. 


Table 1. Annual benefit amounts distribution (euros) 


$b_1$      $b_2$      fr (%) 
6,000      12,000     40 
9,000      18,000     30 
12,000     24,000     15 
15,000     27,000     10 
18,000     30,000     5 


Results of 100,000 simulations are reported in the following tables, assuming a 
safety loading on demographic pricing bases given by a 10% reduction of healthy and 
disabled death probabilities and an initial capital $K(0) = \mathrm{RBC}_{99.5\%}(0,1)$. Simulated 
values of $u(t)$ are shown in Figure 1. The figure highlights the strong variability of the 
risk reserve distribution, especially when $t > 5$, as a consequence of demographic 
scenario changes. Even though the risk reserve has a positive trend due to safety 
loading, lower percentiles are negative. Economic consequences of such an aspect are 
relevant for the insurer solvency and will be quantified through solvency requirements. 
In Table 2 we report the values of the $u(t)$ moments and the coefficient of variation as 
well as the finite time ruin probability. It can be noticed that expected values of $u(t)$ 
are always positive and increase with time, as does the standard deviation. Looking 
at the coefficient of variation we observe an increase of relative variability up to 
$t = 10$; thereafter it decreases. Such a behaviour demonstrates that demographic risk is 
mainly caused by the scenario changes (perceived with a delay of 5 years) affecting the 
evaluation of technical provisions. When technical provisions decrease, the coefficient 
of variation of $u(t)$ becomes steady. The tendency of the risk to become stable is confirmed 
by the finite time ruin probability values, which increase with time but level off after 
$T = 20$ (40.67% against 40.84% at $T = 30$). As expected, $\Psi_u(0,1)$ 
is consistent with the initial capital provision, $K(0) = \mathrm{RBC}_{99.5\%}(0,1)$. 


Fig. 1. $u(t)$ with safety loading = 10% reduction of death probabilities, initial capital 
$K(0) = \mathrm{RBC}_{99.5\%}(0,1)$ 


Table 2. Moments of $u(t)$ and the finite time ruin probability with initial capital 
$K(0) = \mathrm{RBC}_{99.5\%}(0,1)$, safety loading = 10% reduction of death probabilities 


$u(T)$                 T = 1     T = 5     T = 10    T = 20    T = 30 
Mean (%)               0.79      1.51      2.75      6.61      10.94 
Std Dev (%)            0.37      1.26      4.58      5.27      6.36 
Coeff Var              0.4673    0.8372    1.6614    0.7974    0.5817 
Skew                   0.4495    0.2867    0.0010    -0.0194   -0.0002 
$\Psi_u(0,T)$ (%)      0.50      15.06     28.98     40.67     40.84 

Table 3 shows the values of RBC requirements for three different confidence 
levels: 98%, 99% and 99.5%. RBC values rise with time and become steady at $T = 20$ 
only if RBC is computed on a time horizon $(0,T)$, rather than at time $T$. On the other 
hand, if we look at RBC computed according to VaR, we obtain lower values with 
respect to the previous ones, especially for $T > 10$. Results show that the initial 
capital should be increased by about 6% of the single premium income to guarantee 
the insurance solvency on the portfolio time horizon. 


Table 3. Risk-based capital with safety loading = 10% reduction of death probabilities, initial 
capital $K(0) = \mathrm{RBC}_{99.5\%}(0,1)$ 


$\mathrm{rbc}_{1-\epsilon}(0,T)$          T = 1     T = 5     T = 10    T = 20    T = 30 
$\epsilon$ = 0.5%               0.67%     1.78%     6.52%     6.57%     6.57% 
$\epsilon$ = 1.0%               0.62%     1.62%     6.20%     6.24%     6.24% 
$\epsilon$ = 2.0%               0.55%     1.43%     5.91%     5.94%     5.94% 

$\mathrm{rbc}^{VaR}_{1-\epsilon}(0,T)$    T = 1     T = 5     T = 10    T = 20    T = 30 
$\epsilon$ = 0.5%               0.67%     1.77%     6.03%     4.07%     2.63% 
$\epsilon$ = 1.0%               0.62%     1.60%     5.75%     3.61%     2.12% 
$\epsilon$ = 2.0%               0.55%     1.39%     5.30%     3.01%     1.48% 


Fig. 2. Expected value of annual insurance and patrimonial profit with safety loading = 10% 
reduction of death probabilities, initial capital $K(0) = \mathrm{RBC}_{99.5\%}(0,1)$ 


Table 4. Moments of $u(t)$ and the finite time ruin probability with initial capital $K(0) = 0$, 
safety loading = 10% reduction of death probabilities 


$u(T)$                 T = 1     T = 5     T = 10    T = 20    T = 30 
Mean (%)               0.10      0.73      1.85      5.40      9.31 
Std Dev (%)            0.37      1.26      4.58      5.27      6.36 
Skew                   0.4495    0.2867    0.0010    -0.0194   -0.0002 
$\Psi_u(0,T)$ (%)      42.06     58.42     62.37     67.72     67.79 

Figure 2 shows the expected values of annual profit components as stated in 
Section 5. The insurance profit line shows greater variability, being affected by de- 
mographic risks. Meanwhile, the patrimonial profit line is more regular due to the 
absence of financial risk, and increases with time, depending on investments of risk 
reserve (return produced by the investment of risk reserve). 

In order to evaluate the effect of different initial capital provisions, we fix 
$K(0) = 0$. Further, according to ISVAP (the Italian insurance supervisory authority), 
which splits the minimum solvency margin in life insurance to face demographic 
and financial risk into 1% and 3% of technical provisions, respectively, we fix 
$K(0) = 1\%\,V(0^+)$. The moments of the $u(t)$ distribution and the ruin probability are 
reported in Tables 4 and 5. They can be compared with the results of Table 2. 

Note that $K(0)$ values do not affect the standard deviation and skewness of the $u(t)$ 
distribution, while they do influence the $u(t)$ expected value, which increases when 
$K(0)$ rises. Now, let us consider a higher safety loading given by a 20% reduction 
of death probabilities for both healthy and disabled people. Values of the moments 
of $u(t)$ and finite time ruin probabilities are reported in Table 6. If we compare these 
values with the ones in Table 2 (where the safety loading is equal to a 10% reduction 
of death probabilities), we find that safety loading strongly affects the expected values 
of $u(t)$, but does not significantly affect the standard deviation and skewness. In other 
words, safety loading reduces the probability of the risk reserve becoming negative (as 
shown by the $\Psi_u(0,T)$ values), but does not lower its variability. Moreover, required 
capital decreases as the safety loading increases (see Table 7 compared with Table 3). 
Note that in the long term the $\mathrm{rbc}^{VaR}$ values are negative, therefore an initial capital is not 
necessary to guarantee the insurance solvency. Nonetheless, the requirement reduction 
is financed by the policyholders through a premium increase, due to a higher safety 
loading, making the insurance company less competitive on the market. Therefore, 
it is important to combine a solvency target with commercial policies. 


Table 5. Moments of $u(t)$ and the finite time ruin probability with initial capital 
$K(0) = 1\%\,V(0^+)$, safety loading = 10% reduction of death probabilities 


$u(T)$                 T = 1     T = 5     T = 10    T = 20    T = 30 
Mean (%)               1.13      1.89      3.20      TOL       11.74 
Std Dev (%)            0.37      1.26      4.58      5.27      6.36 
Skew                   0.4495    0.2867    0.0010    -0.0194   -0.0002 
$\Psi_u(0,T)$ (%)      0.00      7.02      24.18     36.69     36.82 


Table 6. Moments of capital ratio and the finite time ruin probability with initial capital 
$K(0) = \mathrm{RBC}_{99.5\%}(0,1)$, safety loading = 20% reduction of death probabilities 


$u(T)$                 T = 1     T = 5     T = 10    T = 20    T = 30 
Mean (%)               0.78      2.07      4.49      12.01     20.39 
Std Dev (%)            0.37      1.26      4.50      5.25      6.20 
Skew                   0.4544    0.2892    0.0120    -0.0197   -0.0038 
$\Psi_u(0,T)$ (%)      0.50      7.07      22.21     30.56     30.56 


Table 7. Risk-based capital with safety loading = 20% reduction of death probabilities, initial 
capital $K(0) = \mathrm{RBC}_{99.5\%}(0,1)$ 


$\mathrm{rbc}_{1-\epsilon}(0,T)$          T = 1     T = 5     T = 10    T = 20    T = 30 
$\epsilon$ = 0.5%               0.57%     1.23%     5.39%     5.39%     5.39% 
$\epsilon$ = 1.0%               0.52%     1.09%     5.14%     5.14%     5.14% 
$\epsilon$ = 2.0%               0.46%     0.92%     4.88%     4.88%     4.88% 

$\mathrm{rbc}^{VaR}_{1-\epsilon}(0,T)$    T = 1     T = 5     T = 10    T = 20    T = 30 
$\epsilon$ = 0.5%               0.57%     1.18%     4.52%     0.97%     -1.52% 
$\epsilon$ = 1.0%               0.52%     1.02%     4.24%     0.49%     -2.02% 
$\epsilon$ = 2.0%               0.46%     0.81%     3.78%     -0.12%    -2.65% 

Now, let us consider the safety loading impact on $Y(0,T)$ discounted at a rate 
$\rho = 5\% > i$ (see (11)). As expected, $Y(0,T)$ rises with an increase of safety loading: 
the values, expressed as a ratio to total single premium income, move from 6.21% 
to 11.85% when the safety loading rises from a 10% to a 20% reduction of death 
probabilities (Table 8). It is worth noting that looking at the profit 
sources separately, we have a higher increase in $Y^I(0,T)$ than in $Y^P(0,T)$: 104% 
compared to 79%. 


Table 8. Present value of future profits as ratio to total single premium income, 
$Y(0,T)/(\Pi(0,\omega)\, N_1(0))$, with initial capital $K(0) = \mathrm{RBC}_{99.5\%}(0,1)$ 


               SL = 10%    SL = 20%    Δ% 
Total          6.21%       11.85%      91% 
Insurance      3.03%       6.17%       104% 
Patrimonial    3.18%       5.69%       79% 


7 Conclusions 


This paper focuses on disability and longevity risks arising from issues in the estimate 
of residual life expectancy of healthy and disabled people. Our analysis highlights 
that the EPs are affected by a significant demographic risk caused by systematic 
deviations between expected and realised demographic scenarios. The results confirm 
that such a risk is difficult to control, depending on uncertainty in the future evolution 
of biometric functions. The risk reserve distribution shows a strong variability due to 
the demographic scenario changes affecting the technical provision valuation. Since 
such a risk is systematic, the $u(t)$ variability does not lessen when either safety loading 
or initial capital increases. Nonetheless, they are useful tools in managing demographic 
risk because they significantly reduce the ruin probability of the insurance company 
as well as the RBC requirements necessary to ensure the insurer solvency at a fixed 
confidence level. In this paper we take into account an initial capital only, reducing the 
probability of incurring losses. However, the risk of systematic deviations persists, 
requiring an appropriate capital allocation strategy. This topic will be the subject of 
future research together with a suitable reinsurance strategy. 


Clustering mutual funds by return and risk levels 


Francesco Lisi and Edoardo Otranto 


Abstract. Mutual funds classifications, often made by rating agencies, are very common 
and sometimes criticised. In this work, a three-step statistical procedure for mutual funds 
classification is proposed. In the first step fund time series are characterised in terms of returns. 
In the second step, a clustering analysis is performed in order to obtain classes of homogeneous 
funds with respect to the risk levels. In particular, the risk is defined starting from an Asymmetric 
Threshold-GARCH model aimed to describe minimum, normal and turmoil risk. The third 
step merges the previous two. An application to 75 European funds belonging to 5 different 
categories is presented. 


Key words: clustering, GARCH models, financial risk 


1 Introduction 


The number of mutual funds has grown dramatically over recent years. This has led to 
a number of classification schemes that should give reliable information to investors 
on features and performance of funds. Most of these classifications are produced by 
national or international rating agencies. For example, Morningstar groups funds into 
categories according to their actual investment style, portfolio composition, capitali- 
sation, growth prospects, etc. This information is then used, together with that related 
to returns, risks and costs, to set up a more concise classification commonly referred 
to as Star Rating (see [11] for details). Actually, each rating agency has its own 
specific evaluation method, and national associations of mutual funds managers also 
keep and publish their own classifications. 

Problems arise as, in general, classes of different classifications do not coincide. 
Also, all classification procedures have some drawback; for example, they are often 
based on subjective information and require long elaboration time (see, for example, 
[15]). 

In the statistical literature, classification of financial time series has received 
relatively little attention. In addition, to the best of our knowledge, there are no 
comparisons between different proposed classifications and those of the rating agencies. Some 
authors use only returns for grouping financial time series. For example, [15] propose 
a classification scheme that combines different statistical methodologies (principal 
component analysis, clustering analysis, Sharpe’s constrained regression) applied to 
past returns of the time series. Also, the clustering algorithm proposed by [9], 
referring to different kinds of functions, is based only on return levels. Other authors 
based their classifications only on risk and grouped the assets according to the 
distance between volatility models for financial time series [2, 8, 12-14]. Risk-adjusted 
returns, i.e., returns standardised through standard deviation, are used for clustering 
time series by [4]. This approach is interesting, but using the unconditional variance 
as a measure of risk and ignoring the dynamics of volatility seems too simplistic. 

In this paper, a classification based only on the information contained in the net 
asset value (NAV) time series is considered. It rests on the simple and largely agreed 
idea that two very important points in evaluation of funds are return and risk levels. 
In order to measure the return level, the mean annual net period return is considered. 
As regards the riskiness, in the time series literature, it is commonly measured in 
terms of conditional variance (volatility) of a time series. As is well known, volatility 
is characterised by a time-varying behaviour and clustering effects, which imply 
that quiet (low volatility) and turmoil (high volatility) periods alternate. In order to 
account both for the time-varying nature of volatility and for its different behaviour in 
quiet and turmoil periods, an asymmetric version of the standard Threshold GARCH 
model [5, 17], is considered in this work. 

The whole classification scheme consists of three steps: the first groups funds 
with respect to returns whereas the second groups them with respect to riskiness. In 
particular, the whole risk is broken down into constant minimum risk, time-varying 
standard risk and time-varying turmoil risk. Following [12, 13] and [14], the clustering 
related to volatility is based on a distance between GARCH models, which is an 
extension of the AR metric introduced by [16]. Lastly, the third step merges the 
results of the first two steps to obtain a concise classification. 

The method is applied to 75 funds belonging to five categories: aggressive balanced 
funds, prudential balanced funds, corporate bond investments, large capitalisation 
stock funds and monetary funds. In order to make a comparison with the 
classification implied by the Morningstar Star Rating, which ranges from 1 to 5 stars, 
our clustering is based on 5 “stars” as well. As expected, our classification does not 
coincide with the Morningstar Rating because it is only partially based on the same 
criteria. Nevertheless, in more than 82% of the considered funds the two ratings do 
not differ by more than one star. 

The paper is organised as follows. Section 2 describes how the risk is defined. 
Section 3 contains an application and the comparison of our clustering with the Morn- 
ingstar Rating classification. Section 4 concludes. 


2 Risk modelling 


In this section the reference framework for fund riskiness modelling is described. Let 
$y_t$ be the time series of the NAV of a fund and $r_t$ the corresponding log-return time 
series. We suppose that the return dynamics can be described by the following model: 


$$\begin{aligned}
r_t &= \mu_t + \varepsilon_t = \mu_t + h_t^{1/2} u_t, \qquad t = 1, \ldots, T, \\
\varepsilon_t \mid I_{t-1} &\sim N(0, h_t),
\end{aligned} \tag{1}$$


where $\mu_t = E_{t-1}(r_t)$ is the conditional expectation and $u_t$ is an i.i.d. zero-mean 
and unit variance innovation. The conditional variance $h_t$ follows an asymmetric 
version of the Threshold GARCH(1,1) process [5, 17], which stresses the possibility 
of a different volatility behaviour in correspondence with high negative shocks. We 
refer to it as the Asymmetric Threshold GARCH (AT-GARCH) model. Formally, the 
conditional variance can be described as: 


$$\begin{aligned}
h_t &= \gamma + \alpha\, \varepsilon_{t-1}^2 + \beta\, h_{t-1} + \delta\, S_{t-1}\, \varepsilon_{t-1}^2, \\
S_t &= \begin{cases} 1 & \text{if } \varepsilon_t < \varepsilon^* \\ 0 & \text{otherwise,} \end{cases}
\end{aligned} \tag{2}$$


where $\gamma$, $\alpha$, $\beta$, $\delta$ are unknown parameters, whereas $\varepsilon^*$ is a threshold identifying the 
turmoil state. The value of $\varepsilon^*$ could represent a parameter to be estimated, but in 
this work we set it equal to the first decile of the empirical distribution of $\varepsilon$. On the 
whole, this choice maximises the likelihood and the number of significant estimates 
of $\delta$. Also, the first decile seems suitable because it provides, through the parameter $\delta$, 
the change in the volatility dynamics when high (but not extreme) negative returns 
occur. 
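
To make recursion (2) concrete, the following Python sketch computes the conditional variances for a given series of disturbances, with the threshold set at the first decile. The parameter values, the starting variance `h0` and the disturbances are hypothetical, and the decile uses the default interpolation rule of `statistics.quantiles`, which is our choice rather than one stated in this paper.

```python
from statistics import quantiles


def at_garch_variance(eps, gamma, alpha, beta, delta, h0):
    """Return the threshold and the variances aligned with eps, from (2)."""
    threshold = quantiles(eps, n=10)[0]        # first decile of the disturbances
    h = [h0]
    for t in range(1, len(eps)):
        s_prev = 1.0 if eps[t - 1] < threshold else 0.0   # turmoil dummy S_{t-1}
        h.append(gamma + (alpha + delta * s_prev) * eps[t - 1] ** 2
                 + beta * h[-1])
    return threshold, h


# Hypothetical disturbances and parameter values, for illustration only.
eps = [0.5, -2.0, 0.3, -0.1, 1.0, -0.4, 0.2, 0.8, -0.6, 0.1]
threshold, h = at_garch_variance(eps, gamma=0.1, alpha=0.05, beta=0.8,
                                 delta=0.2, h0=0.5)
```

In this toy series only the disturbance of -2.0 falls below the threshold, so the turmoil term raises the variance of the following period and then decays at rate beta.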

The purpose of this work is to classify funds in terms of gain and risk. While the 
net period return is the most common measure of gain, several possible risk measures 
are used in the literature. However, most of them look at specific aspects of riskiness: 
standard deviation gives a medium constant measure; Value-at-Risk tries to estimate 
an extreme risk; the time-varying conditional variance in a standard GARCH model 
focuses on the time-varying risk, and so on. 

In this paper we make an effort to jointly look at risk from different points of view. 
To do this, following [13], we consider the squared disturbances $\varepsilon_t^2$ as a proxy of the 
instantaneous volatility of $r_t$. It is well known that $\varepsilon_t^2$ is a conditionally unbiased, 
but very noisy, estimator of the conditional variance and that realised volatility and 
intra-daily range are, in general, better estimators [1, 3, 10]. However, the adoption 
of $\varepsilon_t^2$ in our framework is justified by practical motivations because intra-daily data 
are not available for mutual funds time series and, thus, realised volatility or range 
are not feasible. Starting from (2), after simple algebra, it can be shown that, for an 
AT-GARCH(1,1), $\varepsilon_t^2$ follows the ARMA(1,1) model: 


$$\varepsilon_t^2 = \gamma + (\alpha + \delta S_{t-1} + \beta)\, \varepsilon_{t-1}^2 - \beta\, (\varepsilon_{t-1}^2 - h_{t-1}) + (\varepsilon_t^2 - h_t), \tag{3}$$


where $(\varepsilon_t^2 - h_t)$ are uncorrelated, zero-mean errors. 
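The AR($\infty$) representation of (3) is: 


$$\varepsilon_t^2 = \frac{\gamma}{1-\beta} + \sum_{j=1}^{\infty} (\alpha + \delta S_{t-j})\, \beta^{j-1}\, \varepsilon_{t-j}^2 + (\varepsilon_t^2 - h_t), \tag{4}$$


from which it is easy to derive the expected value at time $t$ given past information: 


$$E_{t-1}(\varepsilon_t^2) = \frac{\gamma}{1-\beta} + \sum_{j=1}^{\infty} (\alpha + \delta S_{t-j})\, \beta^{j-1}\, \varepsilon_{t-j}^2. \tag{5}$$


This representation splits the expected volatility, $E_{t-1}(\varepsilon_t^2)$, considered as a whole 
measure of risk, into three positive parts: a constant part, $\gamma/(1-\beta)$, representing the 
minimum risk level which can be reached given the model; the time-varying standard 
risk, $\sum_{j=1}^{\infty} \alpha\, \beta^{j-1} \varepsilon_{t-j}^2$; and the time-varying turmoil risk, $\sum_{j=1}^{\infty} \delta\, S_{t-j}\, \beta^{j-1} \varepsilon_{t-j}^2$, the 
last two being dependent on past information. Of course, the estimation of expression 
(5) requires a finite truncation. 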
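
The three-way split in (5), truncated at a finite number of lags, can be sketched in Python as follows. All inputs are hypothetical. The closing loop is a consistency check of ours: when recursion (2) is started at the minimum risk level, its value after the last observation coincides with the sum of the three truncated components.

```python
def risk_components(eps, flags, gamma, alpha, beta, delta, lags):
    """Truncate the sums in (5) at `lags` terms.

    eps[-j] and flags[-j] hold the disturbance and the turmoil dummy j periods
    before the forecast date. Returns (minimum, standard, turmoil) risk.
    """
    minimum = gamma / (1.0 - beta)
    standard = sum(alpha * beta ** (j - 1) * eps[-j] ** 2
                   for j in range(1, lags + 1))
    turmoil = sum(delta * flags[-j] * beta ** (j - 1) * eps[-j] ** 2
                  for j in range(1, lags + 1))
    return minimum, standard, turmoil


# Hypothetical inputs, for illustration only.
eps = [0.5, -2.0, 0.3, -0.1]
flags = [0, 1, 0, 0]
gamma, alpha, beta, delta = 0.1, 0.05, 0.8, 0.2
parts = risk_components(eps, flags, gamma, alpha, beta, delta, lags=len(eps))

# Cross-check: recursion (2) started at gamma / (1 - beta) gives the same total.
h = gamma / (1.0 - beta)
for e, s in zip(eps, flags):
    h = gamma + (alpha + delta * s) * e ** 2 + beta * h
```

Because the weights decay geometrically at rate beta, the truncation error shrinks quickly with the number of lags retained.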

In order to classify funds with respect to all three risk components, we propose 
considering the distance between an homoskedastic model and a GARCH(1,1) model. 
Using the metric introduced by [12] and re-considered by [14], in the case of speci- 
fication (2) this distance is given by: 


$$\frac{\alpha + \delta S_{t-1}}{\sqrt{1-\beta^2}}. \qquad (6)$$

The previous analytical formulation allows us to provide a vectorial description of the
risk of each fund. In particular, we characterise the minimum constant risk through
the distance between the zero-risk case ($\gamma = \alpha = \beta = \delta = 0$) and the $\alpha = \delta = 0$
case:

$$v_m = \frac{\gamma}{1-\beta}. \qquad (7)$$

The time-varying standard risk is represented, instead, by the distance between a
GARCH(1,1) model ($\delta = 0$) and the corresponding homoskedastic model
($\alpha = \beta = \delta = 0$):

$$v_s = \frac{\alpha}{\sqrt{1-\beta^2}}. \qquad (8)$$

Lastly, the turmoil risk is described by the difference between the distance of an
AT-GARCH model from the homoskedastic model and the distance measured by (8):

$$v_t = \frac{\delta}{\sqrt{1-\beta^2}}. \qquad (9)$$

The whole risk is then characterised by the vector $[v_m, v_s, v_t]'$. If an extra element,
accounting for the return level, $\bar{r}$, is considered, each fund may be featured by the
vector:

$$f = [\bar{r}, v_m, v_s, v_t]'.$$


In order to obtain groups of funds with similar return and risk levels, some clustering 
algorithm can be easily applied directly to f or to some function of the elements of f. 
For example, in the next section risk will be defined as the average of $v_m$, $v_s$ and $v_t$.
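
A short sketch of how the feature vector could be assembled from the estimated parameters is given below. It assumes the readings $v_m = \gamma/(1-\beta)$, $v_s = \alpha/\sqrt{1-\beta^2}$ and $v_t = \delta/\sqrt{1-\beta^2}$ for (7), (8) and (9); the function names and the numerical values are hypothetical.

```python
import math


def risk_vector(gamma, alpha, delta, beta):
    """Return (v_m, v_s, v_t) as read from (7), (8) and (9)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    scale = math.sqrt(1.0 - beta ** 2)
    v_m = gamma / (1.0 - beta)   # constant minimum risk
    v_s = alpha / scale          # time-varying standard risk
    v_t = delta / scale          # time-varying turmoil risk
    return v_m, v_s, v_t


def fund_features(mean_return, gamma, alpha, delta, beta):
    """Feature vector f = [r_bar, v_m, v_s, v_t] for one fund."""
    return (mean_return,) + risk_vector(gamma, alpha, delta, beta)


# Hypothetical parameter estimates for a single fund.
f = fund_features(mean_return=3.2, gamma=0.02, alpha=0.08, delta=0.04, beta=0.6)
```

When $\delta$ is not significant and is set to zero, the turmoil component of the vector is simply zero.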


3 An application


As an application of the previously described procedure, the daily time series of NAV 
of 75 funds of the Euro area and belonging to five different categories were consid- 
ered. The five typologies are the aggressive balanced, prudential balanced, corporate 
bond investments, large capitalisation stock and monetary funds. Data, provided by 
Bloomberg, range from 1/1/2002 to 18/2/2008, for a total of 1601 observations for 
each series. 

Our experiment consists in providing a classification of these funds, characterising 
each group in terms of return and riskiness (following the definitions of constant 
minimum, time-varying standard and time-varying turmoil risk) and comparing our 
classification with that produced by the Morningstar star rating. 

For each fund the return time series was considered and for each calendar year 
the net percentage return was computed; finally the average of the one-year returns, 
$\bar{r}$, was used to represent the gain.

To describe riskiness, first model (1)-(2) was estimated for each fund. When
parameters were not significant at the 5% level, they were set equal to zero and
the corresponding constrained model was estimated. Of course, before accepting the
model the absence of residual ARCH effects in the standardised residuals was checked.
Parameter estimation allowed us to calculate the risks defined as in (7), (8) and (9)
and to characterise the funds by the elements $\bar{r}$, $v_m$, $v_s$, $v_t$ or by some functions of
them.

With these vectors a clustering analysis was performed. In the clustering, a classical
hierarchical algorithm with the Euclidean distance was used, whereas distances
between clusters were calculated following the average-linkage criterion (see, for
example, [7]).¹ In particular, the classification procedure followed three steps:


1. The series were classified into three groups, referring only to the degree of gain,
i.e., $\bar{r}$ low, medium and high.

2. The series were classified into three groups only with respect to the degree of risk
(low, medium and high). To summarise the different kinds of risk, the average
of the three standardised risks was computed for each series. Standardisation is
important because of the different magnitudes of risks; for example, minimum risk
generally has an order of magnitude lower than that of the other two risks.

3. The previous two classifications were merged, combining the degree of gain and
risk so as to obtain a rating from 1 to 5 “stars”; in particular, denoting with h, m
and l the high, medium and low levels respectively and with the couple (a, b) the
levels of gain and risk (with a, b = h, m, l), stars were assigned in the following
way:

1 star for (l, h) (low gain and high risk);

2 stars for (l, m), (l, l) (low gain and medium risk, low gain and low risk);

3 stars for (m, h), (m, m), (h, h) (medium gain and high risk, medium gain and
medium risk, high gain and high risk);

4 stars for (m, l), (h, m) (medium gain and low risk, high gain and medium risk);

5 stars for (h, l) (high gain and low risk).

Of course, this is a subjective definition of stars; nevertheless, it seemed reasonable
to us.


¹ The clustering was performed also using the Manhattan distance and the single-linkage and
complete-linkage criteria. Results are very similar.
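
The star assignment of step 3 amounts to a lookup on the pair (gain level, risk level). A minimal sketch follows; the dictionary and function names are illustrative only.

```python
# Levels: "h" = high, "m" = medium, "l" = low; keys are (gain, risk).
STARS = {
    ("l", "h"): 1,
    ("l", "m"): 2, ("l", "l"): 2,
    ("m", "h"): 3, ("m", "m"): 3, ("h", "h"): 3,
    ("m", "l"): 4, ("h", "m"): 4,
    ("h", "l"): 5,
}


def star_rating(gain, risk):
    """Return the number of stars for a fund with the given gain and risk levels."""
    return STARS[(gain, risk)]


# A fund with medium gain and low risk receives 4 stars.
example = star_rating("m", "l")
```

All nine combinations of gain and risk levels are covered, so every fund receives exactly one rating.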


As in [13], the quality of the clustering was measured using the C-index [6]. This 
index assumes values in the interval [0, 1], assuming small values when the quality 
of the clustering is good. In our experiments, we always obtained C < 0.1. 
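
For illustration, the C-index can be computed as sketched below. The sketch assumes the usual definition, $C = (S - S_{\min})/(S_{\max} - S_{\min})$, where $S$ is the sum of the within-cluster pairwise distances and $S_{\min}$ ($S_{\max}$) is the sum of the same number of smallest (largest) pairwise distances in the whole data set; the Euclidean distance and the toy data are assumptions made here.

```python
from itertools import combinations


def c_index(points, labels):
    """C-index of a partition; values near 0 indicate compact clusters."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    pairs = list(combinations(range(len(points)), 2))
    all_d = sorted(dist(points[i], points[j]) for i, j in pairs)
    within = [dist(points[i], points[j]) for i, j in pairs
              if labels[i] == labels[j]]
    n_w = len(within)
    if n_w == 0:
        return 0.0                      # no within-cluster pairs to assess
    s = sum(within)
    s_min = sum(all_d[:n_w])            # n_w smallest distances overall
    s_max = sum(all_d[-n_w:])           # n_w largest distances overall
    if s_max == s_min:
        return 0.0
    return (s - s_min) / (s_max - s_min)


# Toy data: two well-separated groups on the real line.
pts = [(0.0,), (0.1,), (10.0,), (10.1,)]
good = c_index(pts, [0, 0, 1, 1])   # natural grouping
bad = c_index(pts, [0, 1, 0, 1])    # groups mixed on purpose
```

In this toy example the natural grouping gives a value at the lower end of the interval, while the deliberately mixed grouping gives a value close to one.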

Table 1 lists the step-by-step results of the classification procedure for the group
of monetary funds. The left part of the table shows the classification based on the risk
evaluation and the global rating provided by Morningstar. The central part lists the
elements characterising the funds (one-year average return, constant minimum,
time-varying standard and time-varying turmoil risks). Note that $v_m$ assumes very small
values (due to the small values of $\gamma$) and that only the last fund presents a turmoil risk.²
The right part of the table shows the results of the three-step classification procedure. 
The Gain column contains the classification in high, medium and low gain obtained 
by the clustering of step 1; the Risk column contains the classification in high, medium 
and low risk obtained by the grouping of step 2; lastly, the Stars column shows the 
five-group classification described in step 3. 

The differences with respect to the Morningstar rating are not large: the classification
is the same in 8 cases out of 15, in 6 cases it differs by one star
and in only one case (the 14th fund) do the two classifications differ by 2 stars.


Table 1. Monetary funds: Morningstar classification and details of the clustering procedure


Morningstar                                           Clustering
Risk             Stars   $v_m$      $v_s$   $v_t$     Gain     Risk     Stars
Low              3       4.77E-09   0       0         Medium   Low      4
Below average    5       0          0.087   0         High     Low      5
Below average    3       7.01E-08   0.171   0         Low      Medium   2
Below average    3       5.71E-08   0       0         Medium   Low      4
Below average    3       0          0.180   0         Medium   Medium   3
Below average    3       0          0.231   0         Medium   Medium   3
Average          2       1.93E-07   0       0         Low      Low      2
Average          2       0          0.144   0         Low      Medium   2
Average          4       0          0.208   0         Medium   Medium   3
Above average    2       0          0.155   0         Low      Medium   2
Above average    4       1.91E-07   0.385   0         High     High     3
Above average    2       0          0.234   0         Low      Medium   2
Above average    3       0          0.145   0         Low      Medium   2
High             5       1.29E-06   0       0         Medium   Medium   3
High             1       0          0.151   0.333     Low      High     1


² On the whole, instead, parameter $\delta$ was significant in 11 cases (about 14% of funds).


Table 2. Comparison of Morningstar and Clustering Classification


               Differences in stars
Category       0     1     2     3
Aggr. Bal.     7     6     2     0
Prud. Bal.     4     9     2     0
Corp. Bond     3     5     5     2
Stock          4     10    1     0
Monetary       8     6     1     0

Table 3. Empirical probability and cumulative distribution functions of differences in stars
(percentages)


Differences in stars                          0      1      2      3      4      5
Empirical probability function                34.7   48.0   14.7   2.6    0.0    0.0
Empirical cumulative distribution function    34.7   82.7   97.4   100    100    100


The same procedure was applied to the other four categories and results are summarised
and compared with the Morningstar classification in Table 2. Clearly, the
classifications are different because they are based on different criteria and definitions
of gain and risk. However, in 82.7% of cases the two classifications do not differ
by more than one star. This is evident from Table 3, in which the empirical
probability function of the differences in stars and the corresponding cumulative
distribution function are shown. Moreover, excluding the Corporate Bond Investments,
which present the largest differences between the two classifications, the percentage
of differences equal to or less than 1 increases to 90%, while the remaining 10% differ
by two stars. In particular, the classifications relative to the Aggressive Balanced
and the Monetary funds seem very similar between the two methodologies.
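
The percentages quoted above can be checked by pooling the counts of Table 2. The sketch below takes the counts as read from that table (the Aggressive Balanced row is taken as 7, 6, 2, 0, which is an assumption consistent with Table 3); variable and function names are illustrative.

```python
# Number of funds by difference in stars (0, 1, 2, 3), per category.
counts = {
    "Aggr. Bal.": [7, 6, 2, 0],
    "Prud. Bal.": [4, 9, 2, 0],
    "Corp. Bond": [3, 5, 5, 2],
    "Stock":      [4, 10, 1, 0],
    "Monetary":   [8, 6, 1, 0],
}


def distribution(rows):
    """Return (pmf, cdf) in percentages for the pooled differences."""
    totals = [sum(col) for col in zip(*rows)]
    n = sum(totals)
    pmf = [100.0 * c / n for c in totals]
    cdf = []
    running = 0.0
    for p in pmf:
        running += p
        cdf.append(running)
    return pmf, cdf


pmf_all, cdf_all = distribution(counts.values())

# Same computation without the Corporate Bond category.
no_bond = [row for name, row in counts.items() if name != "Corp. Bond"]
pmf_nb, cdf_nb = distribution(no_bond)
```

Here `cdf_all[1]` is the share of funds whose two ratings differ by at most one star over all 75 funds, and `cdf_nb[1]` is the same share over the 60 funds left once the Corporate Bond category is excluded.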


4 Some concluding remarks


In this paper a clustering procedure to classify mutual funds in terms of gain and
risk has been proposed. It refers to a purely statistical approach, based on a few tools
to characterise return and risk. The method is model-based, in the sense that the
definition of risk is linked to the estimation of a particular Threshold GARCH model,
which characterises quiet and turmoil states of financial markets.

The risk is evaluated simply considering an equally weighted average of three 
different kinds of risk (constant minimum risk, time-varying standard risk and time- 
varying turmoil risk). Different weights could also be considered but at the cost of 
introducing a subjectivity element. 

Surprisingly, in our application, this simple method provided a classification which
does not show large differences with respect to the Morningstar classification. Of 
course, this exercise could be extended to compare our clustering method with other 
alternative classifications and to consider different weighting systems. For example, 
it would be interesting to link weights to some financial variable. As regards applica- 
tions, instead, the main interest focuses on using this approach in asset allocation or 
portfolio selection problems. 


Multivariate Variance Gamma and Gaussian
dependence: a study with copulas* 


Elisa Luciano and Patrizia Semeraro 


Abstract. This paper explores the dynamic dependence properties of a Lévy process, the 
Variance Gamma, which has non-Gaussian marginal features and non-Gaussian dependence. 
By computing the distance between the Gaussian copula and the actual one, we show that even 
a non-Gaussian process, such as the Variance Gamma, can “converge” to linear dependence 
over time. Empirical versions of different dependence measures confirm the result over major 
stock indices data. 


Key words: multivariate variance Gamma, Lévy process, copulas, non-linear dependence 


1 Introduction 


Risk measures and the current evolution of financial markets have spurred the interest 
of the financial community towards models of asset prices which present both non- 
Gaussian marginal behaviour and non-Gaussian, or non-linear, dependence. When 
choosing from the available menu of these processes, one looks for parsimoniousness 
of parameters, good fit of market data and, possibly, ability to capture their dependence 
and the evolution of the latter over time. It is difficult to encapsulate all of these 
features — dynamic dependence, in particular — in a single model. The present paper 
studies an extension of the popular Variance Gamma (VG) model, named α-VG,
which has non-Gaussian features both at the marginal and joint level, while succeeding 
in being both parsimonious and accurate in data fitting. We show that dependence 
“converges” towards linear dependence over time. This represents good news for 
empirical applications, since over long horizons one can rely on standard dependence 
measures, such as the linear correlation coefficient, as well as on a standard analytical 
copula or dependence function, namely the Gaussian one, even starting from data 
which do not present the standard Gaussian features of the Black-Scholes or
log-normal model. Let us first put the model in the appropriate context and then outline
the difficulties that copulas face in describing dynamic dependence.


* © 2008 by Elisa Luciano and Patrizia Semeraro. Any opinions expressed here are those of 
the authors and not those of Collegio Carlo Alberto. 

In the financial literature, different univariate Lévy processes, able to capture
non-normality, have been applied in order to model stock returns (as a reference
for Lévy processes, see for example Sato [10]). Their multivariate extensions are
still under investigation and represent an open field of research. One of the most
popular Lévy processes in finance is the Variance Gamma introduced by Madan
and Seneta [8]. A multivariate extension has been introduced by Madan and Seneta
themselves. A generalisation of this multivariate process, named α-VG, has been
introduced by Semeraro [11]. The generalisation is able to capture independence and
to span a wide range of dependence. For fixed margins it also allows various levels
of dependence to be modelled. This was impossible under the previous VG model.
A thorough application to credit analysis is in Fiorani et al. [5]. The α-VG process
depends on three parameters for each margin ($\mu_j$, $\sigma_j$, $\alpha_j$) and an additional common
parameter $a$. The linear correlation coefficient is known in closed form and its
expression is independent of time. It can be proved [7] that the process also has
non-linear dependence.

How can we study dynamic dependence of the α-VG process? Powerful tools
to study non-linear dependence between random variables are copulas. In a seminal 
paper, Embrechts et al. [4] invoked their use to represent both linear and non-linear 
dependence. Copulas, which had been introduced in the late 1950s in statistics and 
had been used mostly by actuaries, do answer static dependence representation needs. 
However, they hardly cover all the dynamic representation issues in finance. For Lévy 
processes or the distributions they are generated from, the reason is that, for given 
infinitely divisible margins, the conditions that a copula has to satisfy in order to 
provide an infinitely divisible joint distribution are not known [3]. 

In contrast, if one starts from a multivariate stochastic process as a primitive entity,
the corresponding copula seldom exists in closed form at every point in time. Indeed,
copula knowledge at a single point in time does not help in representing dependence
at later maturities. Apart from specific cases, such as the traditional Black-Scholes
process, the copula of the process is time dependent, and reconstructing it from the
evolution equation of the underlying process is not an easy task. In order to describe
the evolution of dependence over time we need a family of copulas $\{C_t, t \ge 0\}$. Most
of the time, as in the VG case, it is neither possible to derive $C_t$ from the expression
of $C_s$, $s \ne t$, nor to get $C_t$ in closed form. However, via Sklar’s Theorem [12], a numerical
version of the copula at any time $t$ can be obtained. The latter argument, together with
the fact that for the α-VG process the linear correlation is constant in time, leads us to
compare the α-VG empirical copula for different tenors $t$ with the Gaussian closed
form one. We study the evolution over time of the distance between the empirical
and the Gaussian copula as a measure of the corresponding evolution of non-linear
dependence.

The paper is organised as follows: Section 2 reviews the VG model and its depen- 
dence; it illustrates how we reconstruct the empirical copula. Section 3 compares the 
approximating (analytical) and actual (numerical) copula, while Section 4 concludes. 


2 VG models


The VG univariate model for financial returns $X(t)$ has been introduced by Madan
and Seneta [8]. It is a natural candidate for exploring multivariate extensions of Lévy
processes and copula identification problems outside the Black-Scholes case for a
number of reasons:


• it can be written as a time-changed Wiener process: its distribution at time $t$ can
be obtained by conditioning;

• it is one of the simplest Lévy processes that present non-Gaussian features at the
marginal level, such as asymmetry and kurtosis;

• there is a well developed tradition of risk measurement implementations for it.


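Formally, let us recall that the VG is a three-parameter Lévy process ($\mu$, $\sigma$, $\alpha$)
with characteristic function

$$\psi_{X_{VG}(t)}(u) = [\psi_{X_{VG}(1)}(u)]^t = \left(1 - iu\mu\alpha + \tfrac{1}{2}\sigma^2\alpha u^2\right)^{-t/\alpha}. \qquad (1)$$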
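
As a quick numerical check of (1), the characteristic function can be evaluated with complex arithmetic. The sketch below assumes the reading of (1) with exponent $-t/\alpha$; the function name is illustrative and the parameter values are those reported for the S&P index in Table 1.

```python
def vg_char_function(u, t, mu, sigma, alpha):
    """Characteristic function (1) of the univariate VG process at time t."""
    base = 1.0 - 1j * u * mu * alpha + 0.5 * sigma ** 2 * alpha * u ** 2
    # The real part of `base` is at least 1, so the principal branch of the
    # complex power is the appropriate one.
    return base ** (-t / alpha)


mu, sigma, alpha = -0.65, 0.22, 0.10           # S&P row of Table 1
phi_0 = vg_char_function(0.0, 1.0, mu, sigma, alpha)
phi_1 = vg_char_function(1.5, 1.0, mu, sigma, alpha)
phi_2 = vg_char_function(1.5, 2.0, mu, sigma, alpha)
```

The values satisfy the two properties implied by (1): the function equals one at $u = 0$, and the value at time 2 is the square of the value at time 1, as expected for a Lévy process.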


The VG process has been generalised to the multivariate setting by Madan and 
Seneta themselves [8] and calibrated on data by Luciano and Schoutens [7]. This 
multivariate generalisation has some drawbacks: it cannot generate independence 
and it has a dependence structure determined by the marginal parameters, one of 
which ($\alpha$) must be common to each marginal process.

To overcome the problem, the multivariate VG process has been generalised to
the α-VG process [11]. The latter can be obtained by time changing a multivariate
Brownian motion with independent components by a multivariate subordinator with 
gamma margins. 

Let $Y_i$, $i = 1, \dots, n$, and $Z$ be independent real gamma processes with parameters
respectively

$$\left(\frac{1}{\alpha_i} - a, \frac{1}{\alpha_i}\right), \quad i = 1, \dots, n,$$

and $(a, 1)$, where $\alpha_j > 0$, $j = 1, \dots, n$, are real parameters and $a \le \frac{1}{\alpha_i}$ $\forall i$. The
multivariate subordinator $\{G(t), t \ge 0\}$ is defined by the following

$$G(t) = (G_1(t), \dots, G_n(t))^T = (Y_1(t) + \alpha_1 Z(t), \dots, Y_n(t) + \alpha_n Z(t))^T. \qquad (2)$$


Let $W_i$ be independent Brownian motions with drift $\mu_i$ and variance $\sigma_i$. The
$\mathbb{R}^n$-valued process $X = \{X(t), t \ge 0\}$ defined as:

$$X(t) = (W_1(G_1(t)), \dots, W_n(G_n(t)))^T, \qquad (3)$$

where $G$ is independent from $W$, is an α-VG process.
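
It depends on three marginal parameters ($\mu_j$, $\sigma_j$, $\alpha_j$) and an additional common
parameter $a$. Its characteristic function is the following

$$\psi_{X(t)}(u) = \prod_{j=1}^{n} \left(1 - \alpha_j\left(i\mu_j u_j - \tfrac{1}{2}\sigma_j^2 u_j^2\right)\right)^{-t\left(\frac{1}{\alpha_j} - a\right)} \left(1 - \sum_{j=1}^{n} \alpha_j\left(i\mu_j u_j - \tfrac{1}{2}\sigma_j^2 u_j^2\right)\right)^{-ta}. \qquad (4)$$

The usual multivariate VG obtains for $\alpha_j = \alpha$, $j = 1, \dots, n$, and $a = \frac{1}{\alpha}$.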


For the sake of simplicity, from now on we consider the bivariate case. 

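Since the marginal processes are VG, the corresponding distributions at time $t$,
$F_t^1$ and $F_t^2$, can be obtained in a standard way, i.e., conditioning with respect to the
marginal time change:

$$F_t^i(x_i) = \int_0^{+\infty} \Phi\left(\frac{x_i - \mu_i z}{\sigma_i\sqrt{z}}\right) f_{G_i(t)}(z)\,dz, \qquad (5)$$

where $\Phi$ is a standard normal distribution function and $f_{G_i(t)}$ is the density of a gamma
distribution with parameters $\left(\frac{t}{\alpha_i}, \frac{1}{\alpha_i}\right)$. The expression for the joint distribution at time
$t$, $F_t = F_{X(t)}$, is:

$$F_t(x_1, x_2) = \int_0^{\infty}\!\int_0^{\infty}\!\int_0^{\infty} \Phi\left(\frac{x_1 - \mu_1(w_1 + \alpha_1 z)}{\sigma_1\sqrt{w_1 + \alpha_1 z}}\right) \Phi\left(\frac{x_2 - \mu_2(w_2 + \alpha_2 z)}{\sigma_2\sqrt{w_2 + \alpha_2 z}}\right) \qquad (6)$$
$$\cdot\, f_{Y_1(t)}(w_1)\, f_{Y_2(t)}(w_2)\, f_{Z(t)}(z)\, dw_1\, dw_2\, dz, \qquad (7)$$

where $f_{Y_1(t)}$, $f_{Y_2(t)}$, $f_{Z(t)}$ are densities of gamma distributions with parameters
respectively: $\left(t\left(\frac{1}{\alpha_1} - a\right), \frac{1}{\alpha_1}\right)$, $\left(t\left(\frac{1}{\alpha_2} - a\right), \frac{1}{\alpha_2}\right)$ and $(ta, 1)$ [11].


2.1 Dependence structure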


In this section, we investigate the dependence or association structure of the α-VG
process.

We know from Sklar’s Theorem that there exists a copula such that any joint
distribution can be written in terms of the marginal ones:

$$F_t(x_1, x_2) = C_t(F_t^1(x_1), F_t^2(x_2)). \qquad (8)$$

The copula $C_t$ satisfies:

$$C_t(u_1, u_2) = F_t((F_t^1)^{-1}(u_1), (F_t^2)^{-1}(u_2)), \qquad (9)$$

where $(F_t^i)^{-1}$ is the generalised inverse of $F_t^i$, $i = 1, 2$.

Since the marginal and joint distributions in (5) and (6) cannot be written in
closed form, the copula of the α-VG process and the ensuing non-linear dependence
measures, such as Spearman’s rho and Kendall’s tau, cannot be obtained analytically.

The only measure of dependence one can find in closed form is the linear
correlation coefficient:

$$\rho_{X(t)} = \frac{\mu_1 \mu_2 \alpha_1 \alpha_2 a}{\sqrt{(\sigma_1^2 + \mu_1^2\alpha_1)(\sigma_2^2 + \mu_2^2\alpha_2)}}. \qquad (10)$$

This coefficient is independent of time, but depends on both the marginal and the
common parameter $a$. For given marginal parameters the correlation is increasing in
the parameter $a$ if $\mu_1\mu_2 > 0$, as is the case in most financial applications. Since $a$
has to satisfy the following bounds: $0 < a \le \min\left(\frac{1}{\alpha_1}, \frac{1}{\alpha_2}\right)$, the maximal correlation
allowed by the model corresponds to $a = \min\left(\frac{1}{\alpha_1}, \frac{1}{\alpha_2}\right)$.

However, it can be proved that linear dependence is not exhaustive, since even
when $\rho = 0$ the components of the process can be dependent [7]. In order to study
the whole dependence we should evaluate empirical versions of the copula obtained
from (9) using the integral expression of the marginal and joint distributions in (5)
and (6). A possibility which is open to the researcher, in order to find numerically the
copula of the process at time $t$, is then the following:


• fix a grid $(u_i, v_i)$, $i = 1, \dots, N$, on the square $[0, 1]^2$;

• for each $i = 1, \dots, N$ compute $(F_t^1)^{-1}(u_i)$ and $(F_t^2)^{-1}(v_i)$ by numerical
approximation of the integral expression: let $(\tilde{F}_t^1)^{-1}(u_i)$ and $(\tilde{F}_t^2)^{-1}(v_i)$ be the
numerical results;

• find a numerical approximation for the integral expression (6), let it be $\hat{F}_t(x_i, y_i)$;

• find the approximated value of $C_t(u_i, v_i)$:

$$\hat{C}_t(u_i, v_i) = \hat{F}_t((\tilde{F}_t^1)^{-1}(u_i), (\tilde{F}_t^2)^{-1}(v_i)), \quad i = 1, \dots, N.$$

We name the copula $\hat{C}_t$ numerical, empirical or actual copula of the α-VG
distribution at time $t$.

In order to discuss the behaviour of non-linear dependence we compare the empirical
copula and the Gaussian one with the same linear correlation coefficient, for
different tenors $t$. We use the classical $L^1$ distance:

$$d_t(C_t, C'_t) = \int_0^1\!\int_0^1 |C_t(u, v) - C'_t(u, v)|\,du\,dv. \qquad (11)$$

It is easy to demonstrate that the distance $d$ is consistent with concordance order,
i.e., $C_t \prec C'_t \prec C''_t$ implies $d(C_t, C'_t) \le d(C_t, C''_t)$ [9]. It follows that the nearer the
copulas are in terms of concordance, the nearer they are in terms of $d_t$. Observe that
the maximal distance between two copulas is 1, i.e., the distance between the upper
and lower Fréchet bounds.
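
When the copulas are available only on a grid, the integral in (11) has to be replaced by a sum. The sketch below is one possible discretisation (a midpoint rule on an $n \times n$ grid, applied to the formula as it reads here, with no normalising constant); the function name, the grid size and the two test copulas are assumptions chosen only for illustration.

```python
def l1_distance(copula_a, copula_b, n=100):
    """Midpoint-rule approximation of the integral of |A - B| over [0, 1]^2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            total += abs(copula_a(u, v) - copula_b(u, v))
    return total * h * h


def independence(u, v):
    """Copula of two independent variables."""
    return u * v


def comonotone(u, v):
    """Copula of two perfectly positively dependent variables."""
    return min(u, v)


d = l1_distance(comonotone, independence)
```

For these two closed-form copulas the exact value of the integral is 1/3 - 1/4 = 1/12, which gives a simple way to check the grid approximation before applying it to a numerical copula.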

Therefore for each $t$ we:


• fix the marginal parameters and a linear correlation coefficient;

• find the numerical copula $\hat{C}_t$ of the process over the prespecified grid;

• compute the distance between the numerical and Gaussian copula.¹ Please note
that, since the linear correlation $\rho_{X(t)}$ in (10) is independent of time, the Gaussian
copula remains the same too ($C'_t = C'$ in (11)).


3 Empirical investigation 


3.1 Data 


The procedure outlined above has been applied to a sample of seven major stock 
indices: S&P 500, Nasdaq, CAC 40, FTSE 100, Nikkei 225, Dax and Hang Seng. For 
each index we estimated the marginal VG parameters under the risk neutral measure, 
using our knowledge of the (marginal) characteristic function, namely (4). From the 
characteristic function, call option theoretical prices were obtained using the Frac- 
tional Fast Fourier Transform (FRFT) in Chourdakis [2], which is more efficient than 
the standard Fast Fourier Transform (FFT). The data for the corresponding observed 
prices are Bloomberg quotes of the corresponding options with three months to ex- 
piry. For each index, six strikes (the closest to the initial price) were selected, and the 
corresponding option prices were monitored over a one-hundred-day window, from 
7/14/06 to 11/30/06. 


3.2 Selection of the α-VG parameters


We estimated the marginal parameters as follows: using the six quotes of the first 
day only, we obtained the parameter values which minimised the mean square error 
between theoretical and observed prices, the theoretical ones being obtained by FRFT. 
We used the results as guess values for the second day, the second day results as guess 
values for the third day, and so on. The marginal parameters used here are the average 
of the estimates over the entire period. The previous procedure is intended to provide 
marginal parameters which are actually not dependent on an initial arbitrary guess 
and are representative of the corresponding stock index price, under the assumption 
that the latter is stationary over the whole time window. The marginal values for the 
VG processes are reported in Table 1. 

We performed our analysis using the marginal parameters reported above and the
maximal correlation allowed by the model. The idea is indeed that positive and large
dependence must be well described. For each pair of assets, Table 2 gives the maximal
possible value of $a$, namely $a = \min\left\{\frac{1}{\alpha_1}, \frac{1}{\alpha_2}\right\}$ (lower entry), and the corresponding
correlation coefficient (upper entry), obtained using (10) in correspondence to the
maximal $a$.


¹ Since we have the empirical copula only on a grid we use the discrete version of the previous
distance.


Table 1. Calibrated parameters for the α-VG price processes for the stock indices in the sample


Asset        $\mu_i$   $\sigma_i$   $\alpha_i$
S&P          -0.65     0.22         0.10
Nasdaq       -0.67     0.11         0.13
CAC 40       -0.46     0.10         0.11
FTSE         -0.59     0.045        0.031
Nikkei       -0.34     0.16         0.10
DAX          -0.27     0.13         0.14
Hang Seng    -1.68     0.8          0.03


Table 2. Maximal correlation and $a$-parameter (in parentheses) for the calibrated α-VG models,
all stock indices


             S&P       Nasdaq    CAC 40    FTSE       Nikkei    Dax
Nasdaq       0.803
CAC 40       0.795     0.701
FTSE         0.505     0.410     0.406
Nikkei       0.556     0.461     0.457     0.284
Dax          0.512     0.536     0.447     0.261      0.294
Hang Seng    0.500     0.406     0.403     0.834      0.282     0.259
             (9.791)   (7.590)   (9.020)   (31.976)   (9.593)   (7.092)


3.3 Copula results


We computed the empirical copula $\hat{C}_t$ for the following tenors: $t$ = 0.1, 1, 10, 100.
We report in Table 3 the distances $d_t$ corresponding to each pair of stocks and each
time $t$.

In order to give a qualitative idea of the distances obtained we also provide a
graphical representation of the copula level curves for the pair Nasdaq and S&P at
time $t = 1$.

We observe that the distance in Table 3 is very low and, for most pairs, decreasing
in time. The plot
(and similar, unreported ones, for other couples and tenors) reinforces the conclusion.
Therefore the Gaussian copula seems to be a good approximation of the true copula,
at least for long horizons.


Table 3. Distances between the Gaussian and empirical copula for calibrated α-VG price
processes, over different stock indices and over time $t$ (expressed in years)


Pair $t$ = 0.1 $t$ = 1 $t$ = 10 $t$ = 100

S&P/Nasdaq 0.015 0.0098 0.0098 0.0097 
S&P/CAC 40 0.022 0.0098 0.0097 0.0097 
S&P/FTSE 0.0101 0.0085 0.0085 0.0085 
S&P/Nikkei 0.037 0.0094 0.0091 0.0089 
S&P/DAX 0.034 0.0092 0.0088 0.0087 
S&P/Hang Seng 0.011 0.0083 0.0084 0.0084 
Nasdaq/CAC 40 0.020 0.0095 0.0095 0.0094 
Nasdaq/FTSE 0.010 0.0079 0.0079 0.0079 
Nasdaq/Nikkei 0.0263 0.0088 0.0085 0.0083 
Nasdaq/DAX 0.035 0.0092 0.0088 0.0087 
Nasdaq/Hang Seng 0.010 0.0078 0.0079 0.0079 
CAC 40/FTSE 0.010 0.0079 0.0079 0.0079 
CAC 40/Nikkei 0.0261 0.0088 0.0085 0.0085 
CAC 40/DAX 0.0273 0.0088 0.0085 0.0083 
CAC 40/Hang Seng 0.010 0.0078 0.0079 0.0079 
FTSE/Nikkei 0.0170 0.0078 0.0074 0.0072 
FTSE/DAX 0.0165 0.0077 0.0072 0.0071 
FTSE/Hang Seng 0.0097 0.0098 0.0098 0.0098 
Nikkei/DAX 0.0201 0.0078 0.0074 0.0073 
Nikkei/Hang Seng 0.012 0.0071 0.0071 0.0071 
DAX/Hang Seng 0.0115 0.0069 0.0069 0.0069 


3.4 Measures of dependence 


In order to confirm our results we also compare two non-linear dependence measures 
obtained simulating the copula with the corresponding ones of the Gaussian copula. 

For $t$ = 0.1, 1, 10, 100 we computed the simulated values of Spearman’s rho,
$\rho_S(t)$, and Kendall’s tau, $\tau(t)$, obtained from the empirical copulas. The methodology
is described in Appendix A.

We found the analytical values of the Gaussian copula corresponding to each pair,
by means of the relationships:

$$\rho_S = \frac{6}{\pi}\arcsin\frac{\rho}{2}; \qquad \tau = \frac{2}{\pi}\arcsin\rho. \qquad (12)$$
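
As a worked example of (12), the following sketch evaluates both relationships for the S&P/Nasdaq pair, using the maximal correlation 0.803 reported in Table 2. The function names are illustrative.

```python
import math


def gaussian_spearman(rho):
    """Spearman's rho of a Gaussian copula with linear correlation rho."""
    return 6.0 / math.pi * math.asin(rho / 2.0)


def gaussian_kendall(rho):
    """Kendall's tau of a Gaussian copula with linear correlation rho."""
    return 2.0 / math.pi * math.asin(rho)


rho = 0.803                      # S&P/Nasdaq entry of Table 2
rho_s = gaussian_spearman(rho)
tau = gaussian_kendall(rho)
```

Rounded to two decimals the results are 0.79 and 0.59, the values shown in the Gauss column of Table 4.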

The results obtained are consistent with respect to the copula distances, as expected.
They confirm the “tendency” towards Gaussian dependence as $t$ increases.
We report below the results for the first index pair, namely S&P/Nasdaq. The others
behave in a similar way.


Fig. 1. Level curves of the Gaussian (Gaus) and empirical (emp) copula of the α-VG calibrated
price processes, S&P/Nasdaq, after one year


Table 4. Simulated values of $\rho_S(t)$ and $\tau(t)$ for the numerical ($\hat{C}_t$) and Gaussian copula (Gauss)
over different horizons. S&P/Nasdaq pair


Pair                      $C_{0.1}$   $C_1$   $C_{10}$   $C_{100}$   Gauss
S&P/Nasdaq    $\rho_S$    0.74        0.78    0.79       0.79        0.79
              $\tau$      0.54        0.59    0.58       0.59        0.59


4 Conclusions and further research 


This paper measures the non-linear dependence of the α-VG process, calibrated to
a set of stock market data, by means of a distance between its empirical copula at
time $t$ and the corresponding Gaussian one, which is characterised by the (constant)
correlation coefficient of the process.

Our empirical analysis suggests that non-linear dependence is “decreasing” in 
time, since the approximation given by the Gaussian copula improves in time. As 
expected, non-linear dependence coefficients confirm the result. The tentative con- 
clusion is that, similarly to marginal non-Gaussianity, which is usually stronger on 
short-horizon than on long-horizon returns, joint non-linear dependence and non- 
Gaussianity fade over time. 

This represents an important point of departure for practical, large-scale implementations
of the α-VG model and of its subcase, the traditional VG model. Any
multivariate derivative price or portfolio risk measure indeed is based on the joint 
distribution of returns. If we use a time-varying empirical copula in order to re-assess 
prices and risk measures over time, and we want the results to be reliable, extensive
and time-consuming simulations are needed. If, on the contrary, we can safely
approximate the actual copula with the Gaussian one, at least over long horizons,
life becomes much easier. Closed or semi-closed formulas exist for pricing and risk 
measurement in the presence of the Gaussian copula (see for instance [1]). Standard 
linear correlation can be used for the model joint calibration. 

In a nutshell, one can adopt an accurate, non-Gaussian model and safely ignore 
non-linear (and non-analytical) dependence, in favour of the linear dependence rep- 
resented by the familiar Gaussian copula, provided the horizon is long enough. In 
the stock market case analysed here, one year was quite sufficient for non-Gaussian 
dependence to be ignored. 


Appendix
Simulated measure of dependence


The simulated version of Spearman’s rho at time $t$, $\rho_S(t)$, can be obtained from a
sample of $N$ realisations of the processes at time $t$, $(x_1^i(t), x_2^i(t))$, $i = 1, \dots, N$:

$$\rho_S(t) = 1 - \frac{6\sum_{i=1}^{N}(R_i - S_i)^2}{N(N^2 - 1)}, \qquad (13)$$

where $R_i = \mathrm{Rank}(x_1^i(t))$ and $S_i = \mathrm{Rank}(x_2^i(t))$. Similarly for Kendall’s tau, $\tau_C(t)$:

$$\tau_C(t) = \frac{c - d}{\binom{N}{2}}, \qquad (14)$$

where $c$ is the number of concordant pairs of the sample and $d$ the number of discordant
ones. A pair $(x_1^i(t), x_2^i(t))$ is said to be discordant [concordant] if $x_1^i(t)\,x_2^i(t) < 0$
[$x_1^i(t)\,x_2^i(t) \ge 0$]. The $N$ realisations of the process are obtained as follows:


• Simulate $N$ realisations from the independent laws $\mathcal{L}(Y_1)$, $\mathcal{L}(Y_2)$, $\mathcal{L}(Z)$; let them
be respectively $y_1^n$, $y_2^n$, $z^n$ for $n = 1, \dots, N$.

• Obtain $N$ realisations $(g_1^n, g_2^n)$ of $G$ through the relations $G_1 = Y_1 + \alpha_1 Z$ and
$G_2 = Y_2 + \alpha_2 Z$.

• Generate $N$ independent random draws from each of the independent random
variables $M_1$ and $M_2$ with laws $N(0, G_1)$ and $N(0, G_2)$. The draws for $M_1$ in
turn are obtained from $N$ normal distributions with zero mean and variance $g_1^n$,
namely

$$M_1(n) = N(0, g_1^n).$$

The draws for $M_2$ are from normal distributions with zero mean and variance $g_2^n$,
namely

$$M_2(n) = N(0, g_2^n).$$

• Obtain $N$ realisations $(x_1^n, x_2^n)$ of $X(t)$ by means of the relations

$$x_1^n = \mu_1 g_1^n + \sigma_1 M_1(n),$$
$$x_2^n = \mu_2 g_2^n + \sigma_2 M_2(n).$$
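
The simulation steps and the two rank measures can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a gamma law with parameters $(p, q)$ is read as shape $p$ and rate $q$, the time change is taken as $G_i = Y_i + \alpha_i Z$ in line with (2), Kendall's tau uses the usual pairwise definition (concordance judged on the differences between two observations), and all function names are hypothetical.

```python
import math
import random


def simulate_alpha_vg(n, t, mu, sigma, alpha, a, rng=random):
    """Draw n realisations of a bivariate alpha-VG process at time t.

    Requires 0 < a < min(1/alpha_i) so that every gamma shape is positive.
    random.gammavariate takes (shape, scale), hence scale = 1 / rate.
    """
    sample = []
    for _ in range(n):
        z = rng.gammavariate(t * a, 1.0)                   # common factor Z(t)
        point = []
        for i in range(2):
            y = rng.gammavariate(t * (1.0 / alpha[i] - a), alpha[i])
            g = y + alpha[i] * z                           # time change G_i(t)
            m = rng.gauss(0.0, math.sqrt(g))               # M_i ~ N(0, g)
            point.append(mu[i] * g + sigma[i] * m)
        sample.append(tuple(point))
    return sample


def ranks(values):
    """Ranks 1..N of the values (no ties expected for continuous draws)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman_rho(sample):
    """Rank-difference formula (13)."""
    n = len(sample)
    r = ranks([x1 for x1, _ in sample])
    s = ranks([x2 for _, x2 in sample])
    d2 = sum((ri - si) ** 2 for ri, si in zip(r, s))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def kendall_tau(sample):
    """(c - d) / binom(N, 2), comparing every pair of observations."""
    n = len(sample)
    c = d = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (sample[i][0] - sample[j][0]) * (sample[i][1] - sample[j][1])
            if sign > 0:
                c += 1
            elif sign < 0:
                d += 1
    return (c - d) / (n * (n - 1) / 2)


def pearson(sample):
    """Sample linear correlation, for comparison with (10)."""
    n = len(sample)
    m1 = sum(x1 for x1, _ in sample) / n
    m2 = sum(x2 for _, x2 in sample) / n
    cov = sum((x1 - m1) * (x2 - m2) for x1, x2 in sample)
    v1 = sum((x1 - m1) ** 2 for x1, _ in sample)
    v2 = sum((x2 - m2) ** 2 for _, x2 in sample)
    return cov / math.sqrt(v1 * v2)
```

A call such as `simulate_alpha_vg(4000, 1.0, (-0.6, -0.6), (0.2, 0.15), (0.1, 0.12), 8.0)` (hypothetical parameter values) returns a list of pairs to which the three measures can be applied. The pairwise loop in `kendall_tau` is quadratic in the sample size, so a subsample may be preferable for large $N$.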


A simple dimension reduction procedure for corporate
finance composite indicators* 


Marco Marozzi and Luigi Santamaria 


Abstract. Financial ratios provide useful quantitative financial information to both investors 
and analysts so that they can rate a company. Many financial indicators from accounting books 
are taken into account. Instead of sequentially examining each ratio, one can analyse together 
different combinations of ratios in order to simultaneously take into account different aspects. 
This may be done by computing a composite indicator. The focus of the paper is on reducing the 
dimension of a composite indicator. A quick and compact solution is proposed, and a practical 
application to corporate finance is presented. In particular, the liquidity issue is addressed. The 
results suggest that analysts should take our method into consideration as it is much simpler
than other dimension reduction methods such as principal component or factor analysis and is
therefore much easier for non-statisticians (as financial analysts generally are) to use in
practice. Moreover, the proposed method is always readily comprehended and requires milder
assumptions.


Key words: dimension reduction, composite indicator, financial ratios, liquidity 


1 Introduction 


Financial ratios provide useful quantitative financial information to both investors and 
analysts so that they can rate a company. Many financial indicators from accounting 
books are taken into account. In general, ratios measuring profitability, liquidity, 
solvency and efficiency are considered. 

Instead of sequentially examining each ratio, one can analyse different combina- 
tions of ratios together in order to simultaneously take into account different aspects. 
This can be done by computing a composite indicator. 

Complex variables can be measured by means of composite indicators. The basic 
idea is to break down a complex variable into components which are measurable 
by means of simple (partial) indicators. The partial indicators are then combined to 
obtain the composite indicator. To this end one should 


• possibly transform the original data into comparable data through a proper function
$T(\cdot)$ and obtain the partial indicators;

• combine the partial indicators to obtain the composite indicator through a proper
link (combining) function $f(\cdot)$.

* The paper has been written by and the proposed methods are due to M. Marozzi. L. Santamaria
gave helpful comments to present the application results.


If $X_1, \ldots, X_K$ are the measurable components of the complex variable, then the
composite indicator is defined as

$$M = f(T_1(X_1), \ldots, T_K(X_K)). \tag{1}$$


Fayers and Hand [3] report extensive literature on the practical application of com- 
posite indicators (the authors call them multi-item measurement scales). In practice, 
the simple weighted or unweighted summations are generally used as combining 
functions. See Aiello and Attanasio [1] for a review on the most commonly used data 
transformations to construct simple indicators. 

The purpose of this paper is to reduce the dimension of a composite indicator
so that financial analysts can use it more easily in practice. In the second section, we discuss how to
construct a composite indicator. A simple method to simplify a composite indicator 
is presented in Section 3. A practical application to the listed company liquidity issue 
is discussed in Section 4. Section 5 concludes with some remarks. 


2 Composite indicator computation 


Let $X_{ik}$ denote the kth financial ratio (partial component), $k = 1, \ldots, K$, for the ith
company, $i = 1, \ldots, N$. Let us suppose, without loss of generality, that the partial
components are positively correlated to the complex variable. To compute a composite
indicator, first of all one should transform the original data into comparable data in
order to obtain the partial indicators. Let us consider linear transformations. A linear
transformation LT changes the origin and scale of the data, but does not change the
shape

$$LT(X_{ik}) = a + bX_{ik}, \quad a \in \,]-\infty, +\infty[, \quad b > 0. \tag{2}$$


Linear transformations allow us to maintain the same ratio between observations (they 
are proportional transformations). 

The four linear transformations most used in practice are briefly presented [4]. 
The first two linear transformations are defined as 


$$LT_1(X_{ik}) = \frac{X_{ik}}{\max_i(X_{ik})} \tag{3}$$
and
$$LT_2(X_{ik}) = \frac{X_{ik} - \min_i(X_{ik})}{\max_i(X_{ik}) - \min_i(X_{ik})}, \tag{4}$$
which correspond to LT where $a = 0$ and $b = \frac{1}{\max_i(X_{ik})}$, and where
$a = \frac{-\min_i(X_{ik})}{\max_i(X_{ik}) - \min_i(X_{ik})}$ and $b = \frac{1}{\max_i(X_{ik}) - \min_i(X_{ik})}$, respectively. $LT_1$ and $LT_2$ cancel
the measurement units and force the results into a short and well defined range:
$\frac{\min_i(X_{ik})}{\max_i(X_{ik})} \le LT_1(X_{ik}) \le 1$ and $0 \le LT_2(X_{ik}) \le 1$ respectively. $LT_1$ and $LT_2$ are
readily comprehended.


The third and fourth linear transformations are defined as 


$$LT_3(X_{ik}) = \frac{X_{ik} - E(X_k)}{SD(X_k)} \tag{5}$$
and
$$LT_4(X_{ik}) = \frac{X_{ik} - MED(X_k)}{MAD(X_k)}, \tag{6}$$
which correspond to LT where $a = \frac{-E(X_k)}{SD(X_k)}$ and $b = \frac{1}{SD(X_k)}$, and where
$a = \frac{-MED(X_k)}{MAD(X_k)}$ and $b = \frac{1}{MAD(X_k)}$, respectively. $LT_3(X_{ik})$ indicates how far $X_{ik}$ lies from
the mean $E(X_k)$ in terms of the standard deviation $SD(X_k)$. $LT_4$ is similar to $LT_3$ and
uses the median $MED(X_k)$ instead of the mean as location measure and the median
absolute deviation $MAD(X_k)$ instead of the standard deviation as scale measure.
By means of $LT_h$ (where the subscript h = 1, 2, 3 or 4 denotes the various methods)
the original data are transformed into comparable data. The composite indicator is
then defined using the sum as the combining function, in accordance with general
practice (see [1])

$$M_{h,i} = \sum_{k=1}^{K} LT_h(X_{ik}), \quad h = 1, 2, 3, 4. \tag{7}$$

The $M_{h,i}$s are used to rank the units. Note that the first and second method may be
applied also to ordered categorical variables, or to mixed variables, partly quantitative
and partly ordered categorical, with the unique concern of how to score the ordered
categories.
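
The four transformations and the summation in (7) can be illustrated with a short Python sketch. It assumes that the data are held in an N x K array with one column per ratio, that SD is the sample standard deviation (divisor N - 1) and that MAD is the unscaled median absolute deviation; the text does not specify these conventions, and the function names are hypothetical.

```python
import numpy as np


def linear_transform(x, h):
    """Apply LT_h (h = 1, 2, 3, 4) column by column to an N x K array."""
    x = np.asarray(x, dtype=float)
    if h == 1:
        # Divide by the column maximum.
        return x / x.max(axis=0)
    if h == 2:
        # Rescale each column to the range [0, 1].
        return (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))
    if h == 3:
        # Standardise with mean and sample standard deviation (assumption).
        return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    if h == 4:
        # Standardise with median and unscaled MAD (assumption).
        med = np.median(x, axis=0)
        mad = np.median(np.abs(x - med), axis=0)
        return (x - med) / mad
    raise ValueError("h must be 1, 2, 3 or 4")


def composite_indicator(x, h):
    """Sum the K partial indicators of each unit, as in (7)."""
    return linear_transform(x, h).sum(axis=1)


# Three hypothetical companies observed on two ratios.
ratios = [[1.0, 10.0], [2.0, 30.0], [4.0, 20.0]]
print(composite_indicator(ratios, 4))  # [-2.  1.  2.]
```

In this toy example the third company ranks first under the fourth method, since its composite score is the largest.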

In Section 4 we analyse a data set about listed companies; in particular, we consider
four different liquidity ratios. For the ith company we denote these ratios by
$X_{i1}, X_{i2}, X_{i3}, X_{i4}$. Note that $T(X_{i1}), T(X_{i2}), T(X_{i3}), T(X_{i4})$ are partial financial
indicators since they correspond to a unique financial ratio: $T(X_{ik}) > T(X_{jk})$ lets
the analyst conclude that company i is better than company j as far as financial
ratio $X_k$ is concerned (since of course $T(X_{ik}) > T(X_{jk}) \Leftrightarrow X_{ik} > X_{jk}$), whereas $M_i$
is a composite financial indicator since it simultaneously considers every financial
ratio. $M_1, \ldots, M_N$ allow the analyst to rank the companies since $M_i > M_j$ means
that company i is better than company j regarding all the financial ratios together.
There is reason to believe that financial ratios are correlated. This central question
is addressed in the next section: a simple method for reducing the number of partial
indicators underlying a composite indicator is proposed.

It is important to emphasise that in this paper we do not consider composite
indicators based on non-linear transformations since Arboretti and Marozzi (2005)
showed that such composite indicators perform better than those based on linear
transformations only when distributions of $X_1, \ldots, X_K$ parent populations are very
heavy-tailed. Preliminary analyses on our data show that parent distributions are
not heavy-tailed. Composite indicators based on non-linear transformations may be
based for example on the rank transformation of the $X_{ik}$s or on Lago and Pesarin’s [6]
Nonparametric Combination of Dependent Rankings. For details on this matter see
[7] and [8].


3 To reduce the dimension of composite indicators 


Let $R_K(X_k, k \in \{1, \ldots, K\}) = R_K$ denote the vector of ranks obtained following
the composite financial indicator

$$M_{4,i} = \sum_{k=1}^{K} \frac{X_{ik} - MED(X_k)}{MAD(X_k)} \tag{8}$$

computed for $i = 1, \ldots, N$, which combines all the partial financial indicators
$X_1, \ldots, X_K$. We consider the fourth method because the median absolute deviation
is the most useful ancillary estimate of scale [5, p. 107]. Suppose now that $X_h$ is
excluded from the analysis. Let ${}_hR_{K-1}(X_k, k \in \{1, \ldots, K\} - h) = {}_hR_{K-1}$ denote
the corresponding rank vector. If $R_K$ and ${}_hR_{K-1}$ are very similar, it follows that the
exclusion of $X_h$ does not affect the ranking of the companies much. On the contrary,
if the two rank vectors are very different, by leaving out $X_h$ the ranking process
is greatly influenced. To estimate the importance of $X_h$ we compute the Spearman
correlation coefficient between $R_K$ and ${}_hR_{K-1}$

$$s(R_K, {}_hR_{K-1}) = 1 - \frac{6 \sum_{i=1}^{N} \left(R_K[i] - {}_hR_{K-1}[i]\right)^2}{N(N^2 - 1)}, \tag{9}$$

where $R_K[i]$ and ${}_hR_{K-1}[i]$ are the ith element of the corresponding vector. The
closer s is to 1, the less important $X_h$ is. The idea is to leave out the partial indicator
$X_h$ that brings the greatest $s(R_K, {}_hR_{K-1})$. The procedure may be repeated for the
$K - 2$ rankings obtained by leaving out one more partial indicator. Let $X_l$ be the next
indicator that is excluded from the ranking process. We compute ${}_{l,h}R_{K-2}(X_k, k \in
\{1, \ldots, K\} - \{l, h\}) = {}_{l,h}R_{K-2}$ and $s({}_hR_{K-1}, {}_{l,h}R_{K-2})$ for $l = 1, \ldots, K$, $l \neq h$.
The partial indicator that brings the greatest s should be excluded, and so on.

Even if the whole procedure naturally lasts until only one partial indicator is left to
be used by financial analysts, a natural question arises: when should the partial indicator
exclusion procedure be stopped? That is, how many partial financial indicators
should be excluded? Within this framework, it is assumed that the best ranking is the
one based on all the partial indicators. Of course, there is a trade-off between information
and variable number reduction. A natural stopping rule is: stop the procedure
as soon as the correlation coefficient is less than a fixed value.
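
The exclusion procedure of this section can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: it assumes an N x K array with no tied composite scores (the Spearman formula above presumes untied ranks), it uses the unscaled median absolute deviation, and it reads the stopping rule as "do not drop an indicator whose best correlation falls below `threshold`". All function names and the `threshold` parameter are hypothetical.

```python
import numpy as np


def mad_composite(x):
    """Composite indicator based on the fourth (median/MAD) transformation."""
    med = np.median(x, axis=0)
    mad = np.median(np.abs(x - med), axis=0)
    return ((x - med) / mad).sum(axis=1)


def ranks(values):
    """Ranks 1..N (1 = smallest); assumes there are no ties."""
    order = np.argsort(values, kind="stable")
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(1, len(values) + 1)
    return r


def spearman(r1, r2):
    """Spearman correlation between two untied rank vectors."""
    n = len(r1)
    return 1.0 - 6.0 * np.sum((r1 - r2) ** 2) / (n * (n ** 2 - 1))


def reduce_indicators(x, threshold=None):
    """Drop, one at a time, the indicator whose removal changes the ranking least.

    Returns the column indices still in use and the list of
    (dropped column, Spearman correlation) pairs in order of exclusion.
    """
    x = np.asarray(x, dtype=float)
    kept = list(range(x.shape[1]))
    history = []
    while len(kept) > 1:
        current = ranks(mad_composite(x[:, kept]))
        scores = {}
        for h in kept:
            rest = [k for k in kept if k != h]
            scores[h] = spearman(current, ranks(mad_composite(x[:, rest])))
        drop = max(scores, key=scores.get)
        # Stopping rule (one possible reading): keep the current set when even
        # the least harmful exclusion pushes the correlation below threshold.
        if threshold is not None and scores[drop] < threshold:
            break
        kept.remove(drop)
        history.append((drop, scores[drop]))
    return kept, history


# Hypothetical data: 50 units, 4 partial indicators.
data = np.random.default_rng(0).normal(size=(50, 4))
print(reduce_indicators(data, threshold=0.9))
```

At each step the correlation is computed against the ranking of the current reduced set, not against the full ranking, which mirrors the comparison described in the text for the second step.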


4 A practical application 


We present an application of the procedure for reducing the dimension of corporate 
finance composite indicators. More precisely, the liquidity issue is considered. The 
aim is to rate a set of companies on the basis of the following liquidity ratios. 
The current ratio

$$X_1 = \frac{\text{total current assets}}{\text{total current liabilities}}, \tag{10}$$
indicates the company’s ability to meet short-term debt obligations; the higher the 
ratio, the more liquid the company is. If the current assets of a company are more 
than twice the current liabilities, then that company is generally considered to have 
good short-term financial strength. If current liabilities exceed current assets, then the 
company may have problems meeting its short-term obligations. 
The quick ratio

$$X_2 = \frac{\text{total current assets} - \text{inventory}}{\text{total current liabilities}} \tag{11}$$
is a measure of a company’s liquidity and ability to meet its obligations. It expresses 
the true working capital relationship of its cash, accounts receivables, prepaids and 
notes receivables available to meet the company’s current obligations. The higher the 
ratio, the more financially strong the company is: a quick ratio of 2 means that for 
every euro of current liabilities there are two euros of easily convertible assets. 

The interest coverage ratio

$$X_3 = \frac{\text{earnings before interest and taxes}}{\text{interest expenses}}. \tag{12}$$

The lower the interest coverage ratio, the larger the debt burden is on the company. It
is a measure of a company's ability to meet its interest payments on outstanding debt.
A company that sustains earnings well above its interest requirements is in a good 
position to weather possible financial storms. 
The cash flow to interest expense ratio

$$X_4 = \frac{\text{cash flow}}{\text{interest expenses}}. \tag{13}$$

The meaning is clear: a cash flow to interest expense ratio of 2 means that the company 
had enough cash flow to cover its interest expenses two times over in a year. 
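
As a small worked example, the four ratios can be computed from a company's accounts as follows. The `Accounts` container, its field names and the figures are hypothetical and serve only to illustrate definitions (10) to (13).

```python
from dataclasses import dataclass


@dataclass
class Accounts:
    """Hypothetical accounting figures for one company (same currency unit)."""
    current_assets: float
    inventory: float
    current_liabilities: float
    ebit: float
    cash_flow: float
    interest_expenses: float


def liquidity_ratios(a):
    """Return (X1, X2, X3, X4): current, quick, interest coverage, cash flow to interest."""
    x1 = a.current_assets / a.current_liabilities
    x2 = (a.current_assets - a.inventory) / a.current_liabilities
    x3 = a.ebit / a.interest_expenses
    x4 = a.cash_flow / a.interest_expenses
    return x1, x2, x3, x4


example = Accounts(current_assets=200.0, inventory=50.0, current_liabilities=100.0,
                   ebit=30.0, cash_flow=40.0, interest_expenses=10.0)
print(liquidity_ratios(example))  # (2.0, 1.5, 3.0, 4.0)
```

Here the current ratio of 2 and the quick ratio of 1.5 differ only because of the inventory of 50, which is the distinction discussed in the text.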

These ratios are important in measuring the ability of a company to meet both 
its short-term and long-term obligations. To address company liquidity, one may 
sequentially examine each ratio that addresses the problem from a particular (partial) 
point of view. For example, the current ratio as well as the quick ratio are regarded as 
a test of liquidity for a company, but while the first one expresses the working capital 
relationship of current assets available to meet the company’s current obligations, the 
second one expresses the true working capital relationship of current assets available 
to meet current obligations since it eliminates inventory from current assets. This 
is particularly important when a company is carrying heavy inventory as part of 
its current assets, which might be obsolete. However, it should be noted that in the 
literature the order of their importance is not clear. For more details see for example [9]. 

A dataset about 338 companies listed on the main European equity markets has 
been analysed. We consider listed companies because they have to periodically send 
out financial information following standard rules. First

$$M_i = \sum_{k=1}^{4} \frac{X_{ik} - MED(X_k)}{MAD(X_k)} \tag{14}$$

is computed. This is a composite indicator of liquidity for company i which takes
into account simultaneously the partial liquidity indicators $X_1, X_2, X_3, X_4$. Then,
the corresponding rank vector $R_4(X_1, X_2, X_3, X_4) = R_4$ is computed. This vector
has been compared to the vectors corresponding to the consideration of three partial
indicators: ${}_4R_3(X_1, X_2, X_3) = {}_4R_3$, ${}_3R_3(X_1, X_2, X_4) = {}_3R_3$, ${}_2R_3(X_1, X_3, X_4) =
{}_2R_3$ and ${}_1R_3(X_2, X_3, X_4) = {}_1R_3$, through the Spearman correlation coefficient. In
the first step of the procedure the quick ratio $X_2$ left the analysis since we have

$$s(R_4, {}_4R_3) = 0.9664, \quad s(R_4, {}_3R_3) = 0.9107,$$
$$s(R_4, {}_2R_3) = 0.9667, \quad s(R_4, {}_1R_3) = 0.9600.$$

In the second step, we compare ${}_2R_3(X_1, X_3, X_4) = {}_2R_3$ with ${}_{4,2}R_2(X_1, X_3) =
{}_{4,2}R_2$, ${}_{3,2}R_2(X_1, X_4) = {}_{3,2}R_2$ and ${}_{1,2}R_2(X_3, X_4) = {}_{1,2}R_2$. The cash flow to interest
expense ratio $X_4$ left the analysis since we have

$$s({}_2R_3, {}_{4,2}R_2) = 0.956, \quad s({}_2R_3, {}_{3,2}R_2) = 0.909, \quad s({}_2R_3, {}_{1,2}R_2) = 0.905.$$

In the last step, the current ratio $X_1$ left the analysis since it is $s({}_{4,2}R_2, {}_{3,4,2}R_1) =
0.672$ and $s({}_{4,2}R_2, {}_{1,4,2}R_1) = 0.822$.

We conclude that the ranking obtained by considering together $X_1, X_2, X_3, X_4$
is similar to that based on the interest coverage ratio $X_3$, and the analyst is therefore
advised to focus on $X_3$ in addressing the liquidity issue of the companies. Our method
reduces the information included in the original data by dropping the relatively unimportant
financial data. These dropped financial data, however, might carry important
information when comparing a certain set of companies. For example, the quick ratio $X_2$
has been excluded in the first step of the procedure, so inventory no longer
enters the analyst's decision on whether or not to invest in a company. But
according to Rees [9, p. 195], the market reaction to earnings disclosure of small
firms is great. If the inventory becomes large, the smaller firms might go bankrupt 
because they cannot stand its cost, whereas the larger firms endure it. Moreover there 
may be a lot of seasonality effect on sales because monthly sales may differ greatly. 
This affects small firms deeply; in fact many studies have suggested that the bulk of 
the small firm effect is concentrated in certain months of the year [9, p. 180]. There- 
fore it might not be possible to apply the results of this paper to smaller firms without 
taking into account the inventory issue. Moreover, the importance of the financial data 
available differs among industries. For example, the importance of inventory might 
be different between the manufacturing industry and the financial industry. In general, 
the importance of financial data may vary between the comparison among the whole 
set of companies and the selected set of certain companies. For financial analysts the 
comparison should be done to selected sets of companies. The analysis of variance 
can help in evaluating the bias generated from this method. 
To evaluate if our result depends on the company capitalisation, we divide the
companies into two groups: the 194 companies with a capitalisation less than EUR
five billion and the remaining 146 with a capitalisation greater than EUR five billion.
We adopted the same criterion used by the anonymous financial firm that gave us the
data. For the “small cap” companies we obtained the following results

$$s'(R_4, {}_4R_3) = 0.954, \quad s'(R_4, {}_3R_3) = 0.903, \quad s'(R_4, {}_2R_3) = 0.969, \quad s'(R_4, {}_1R_3) = 0.953;$$
$$s'({}_2R_3, {}_{4,2}R_2) = 0.944, \quad s'({}_2R_3, {}_{3,2}R_2) = 0.909, \quad s'({}_2R_3, {}_{1,2}R_2) = 0.907;$$
$$s'({}_{2,4}R_2, {}_{3,2,4}R_1) = 0.689, \quad s'({}_{2,4}R_2, {}_{1,2,4}R_1) = 0.817;$$


therefore the first liquidity ratio that is excluded is the quick ratio $X_2$, the second
is the cash flow to interest expense ratio $X_4$ and finally the current ratio $X_1$. The
procedure suggests focusing on the interest coverage ratio X3 when ranking the small 
cap companies. 

For the “large cap” companies we obtained the following results

$$s''(R_4, {}_4R_3) = 0.977, \quad s''(R_4, {}_3R_3) = 0.931, \quad s''(R_4, {}_2R_3) = 0.974, \quad s''(R_4, {}_1R_3) = 0.967;$$
$$s''({}_4R_3, {}_{3,4}R_2) = 0.781, \quad s''({}_4R_3, {}_{2,4}R_2) = 0.967, \quad s''({}_4R_3, {}_{1,4}R_2) = 0.959;$$
$$s''({}_{2,4}R_2, {}_{3,2,4}R_1) = 0.698, \quad s''({}_{2,4}R_2, {}_{1,2,4}R_1) = 0.808.$$


These results are again similar to those obtained before, both for all the companies 
and for the small cap ones. The conclusion is that the dimension reduction procedure 
is not much affected by the fact that a company is a large cap one or a small cap 
one. It should be cautioned that this result (as well as the other ones) applies only 
to the data set that has been considered in the paper, but the analysis may be easily 
applied to other data sets or to other financial ratios (efficiency, profitability, ...). 
Moreover, attention should be paid to the industry sector the companies belong to. 
For example, as we have already noted, the role of the inventory might be different 
between the manufacturing industry and the financial industry. Therefore we suggest
that financial analysts group the companies on the basis of the industry sector before
applying the reduction procedure. This question is not addressed here and requires 
further research. 

The data have been reanalysed through principal component analysis, which is the 
most used dimension reduction method. Principal component analysis suggests that 
there are two principal components, the first explains 62.9% and the second 30.2% 
of the variance. The first component is a weighted mean of the liquidity ratios with 
similar weights so that it may be seen as a sort of generic indicator for company 
liquidity. The loadings on component one are 0.476 for X1, 0.488 for X2, 0.480 for 
X3 and 0.552 for X4. The loadings on component two are respectively 0.519, 0.502, 
-0.565 and -0.399. Note that the loadings are positive for X1 and X2, which compare
assets with liabilities, while they are negative for X3 and X4, which are measures of 
company ability to meet its interest payments on outstanding debt. The correlation 
between the ranking based on X3 and that based on the first principal component 
is 0.936. Therefore the rankings are very similar, but the method proposed in this 
paper is simpler to understand and be employed by financial analysts, who
do not usually have a strong background in statistics. From the practical point of
view, our method is more natural since it imitates what many analysts implicitly do in
practice by focusing on the most important aspects, discarding the remaining ones. It
is always readily comprehended, while principal components are often quite difficult
to interpret. From the theoretical point of view, a single, very mild
assumption must be fulfilled for using our method: that financial ratios follow the
larger the better rule. We do not have to assume other hypotheses, which on the contrary
are generally required by other dimension reduction methods such as principal
component analysis (think for example about the hypothesis of linearity) or factor analysis.
Moreover, it is important to emphasise that, if one considers the first or second linear 
transformation method, the composite indicator simplifying procedure may be applied 
also to ordered categorical variables, or to mixed ones, partly quantitative and partly 
ordered categorical, with the unique concern of how to score the ordered categories. 


5 Conclusions 


When a financial analyst rates a company, many financial ratios from its accounting 
books are considered. By computing a composite indicator the analyst can analyse 
different combinations of ratios together instead of sequentially considering each ratio 
independently from the other ones. This is very important since ratios are generally 
correlated. A quick and compact procedure for reducing the number of ratios at the 
basis of a composite financial indicator has been proposed. A practical application to 
the liquidity issue has been discussed. We ranked a set of listed companies by means 
of composite indicators that considered the following liquidity ratios: the current 
ratio, the quick ratio, the interest coverage ratio and the cash flow to interest expense 
ratio. The results suggest that analysts should focus on the interest coverage ratio in 
addressing the liquidity issue of the companies. By applying also principal component 
analysis to the data at hand we showed that our dimension reduction method should be 
preferred because it is always readily comprehended and much simpler. Moreover it
requires a single, very mild assumption: that financial ratios follow the larger the
better rule. However, financial analysts should pay attention to the industry sector the
companies belong to. We suggest that financial analysts should group the companies 
on the basis of the industry sector before applying our reduction procedure. 
The relation between implied and realised volatility in
the DAX index options market


Silvia Muzzioli 


Abstract. The aim of this paper is to investigate the relation between implied volatility, histor- 
ical volatility and realised volatility in the DAX index options market. Since implied volatility 
varies across option type (call versus put) we run a horse race of different implied volatility 
estimates: implied call and implied put. Two hypotheses are tested in the DAX index options 
market: unbiasedness and efficiency of the different volatility forecasts. Our results suggest 
that both implied volatility forecasts are unbiased (after a constant adjustment) and efficient 
forecasts of future realised volatility in that they subsume all the information contained in 
historical volatility. 


Key words: volatility forecasting, Black-Scholes Implied volatility, put-call parity 


1 Introduction 


Volatility is a key variable in option pricing models and risk management techniques 
and has drawn the attention of many theoretical and empirical studies aimed at assess- 
ing the best way to forecast it. Among the various models proposed in the literature in 
order to forecast volatility, we distinguish between option-based volatility forecasts 
and time series volatility models. The former models use prices of traded options 
in order to unlock volatility expectations while the latter models use historical in- 
formation in order to predict future volatility (following [17], in this set we group 
predictions based on past standard deviation, ARCH conditional volatility models 
and stochastic volatility models). Many empirical studies have tested the forecasting 
power of implied volatility versus a time series volatility model. 

Some early contributions find evidence that implied volatility (IV) is a biased 
and inefficient forecast of future realised volatility (see e.g., [2,6, 14]). Although the 
results of some of these studies (e.g., [6, 14]) are affected by overlapping samples, as 
recalled by [4], or mismatching maturities between the option and the volatility fore- 
cast horizon, they constitute early evidence against the unbiasedness and information 
efficiency of IV. More recently, several papers analyse the empirical performance of 
IV in various option markets, ranging from indexes and futures to individual stocks, and
find that IV is an unbiased and efficient forecast of future realised volatility. In the


M. Corazza et al. (eds.), Mathematical and Statistical Methods for Actuarial Sciences and Finance 
© Springer-Verlag Italia 2010 




index options market, Christensen and Prabhala [5] examine the relation between IV 
and realised volatility using S&P100 options, over the time period 1983-1995. They 
find that IV is a good predictor of future realised volatility. Christensen et al. [4] use 
options on the S&P100 and non-overlapping samples and find evidence for the effi- 
ciency of IV as a predictor of future realised volatility. In the futures options market 
Ederington and Guan [8] analyse the S&P500 futures options market and find that 
IV is an efficient forecast of future realised volatility. Szakmary et al. [19] consider 
options on 35 different futures contracts on a variety of asset classes. They find that 
IV, while not a completely unbiased estimate of future realised volatility, has more 
informative power than past realised volatility. In the stock options market, Godbey 
and Mahar [10] analyse the information content of call and put IV extracted from 
options on 460 stocks that compose the S&P500 index. They find that IV contains 
some information on future realised volatility that is superior both to past realised 
volatility and to a GARCH(1,1) estimate. 

Option IV differs depending on strike price of the option (the so called smile 
effect), time to maturity of the option (term structure of volatility) and option type 
(call versus put). As a consequence, in the literature there is an open debate about 
which option class is most representative of market volatility expectations. As for the 
moneyness dimension, most of the studies use at the money options (or close to the 
money options) since they are the most heavily traded and thus the most liquid. As 
for the time to maturity dimension, the majority of the studies use options with time 
to maturity of one month in order to make it equal to the sampling frequency and the 
estimation horizon of realised volatility. As for the option type, call options are more 
used than put options. As far as we know, there is little evidence about the different 
information content of call or put prices. Even if, theoretically, call and put are linked 
through the put-call parity relation, empirically, given that option prices are observed 
with measurement errors (stemming from finite quote precision, bid-ask spreads, non- 
synchronous observations and other measurement errors), small errors in any of the 
input may produce large errors in the output (see e.g., [12]) and thus call IV and put IV 
may be different. Moreover, given that put options are frequently bought for portfolio 
insurance, there is a substantial demand for puts that is not available for the same 
call options. Also, in [15] we have proved that the use of both call and put options 
improves the pricing performance of option implied trees, suggesting that call and 
put may provide different information. Fleming [9] investigates the implied-realised 
volatility relation in the S&P100 options market and finds that call IV has slightly 
more predictive power than put IV. In the same market, Christensen and Hansen [3] 
find that both call and put IV are informative of future realised volatility, even if call
IV performs slightly better than put IV. Both studies use American options and need 
the estimation of the dividend yield. These two aspects influence call and put options 
in a different manner and may alter the comparison if not properly addressed. 

The aim of the paper is to explore the relation between call IV, put IV, historical 
volatility and realised volatility in the DAX index option market. The market is chosen 
for two main reasons: (i) the options are European, therefore the estimation of the 
early exercise premium is not needed and cannot influence the results; (ii) the DAX 
index is a capital weighted performance index composed of 30 major German stocks 
and is adjusted for dividends, stock splits and changes in capital. Since dividends
are assumed to be reinvested into the shares, they do not affect the index value. The
plan of the paper is the following. In Section 2 we illustrate the data set used, the
sampling procedure and the definition of the variables. In Section 3 we describe
the methodology used in order to address the unbiasedness and efficiency of the
different volatility forecasts. In Section 4 we report the results of the univariate and 
encompassing regressions and we test our methodology for robustness. Section 5 
concludes. 


2 The data set, the sampling procedure and the definition of the 
variables 


Our data set consists of daily closing prices of at the money call and put options on 
the DAX index, with one-month maturity recorded from 19 July 1999 to 6 December 
2006. The data source is DATASTREAM. Each record reports the strike price, expi- 
ration month, transaction price and total trading volume of the day separately for call 
and put prices. We have a total of 1928 observations. As for the underlying we use 
the DAX index closing prices recorded in the same time period. As a proxy for the 
risk-free rate we use the one-month Euribor rate. DAX options are European options 
on the DAX index, which is a capital weighted performance index composed of 30 
major German stocks and is adjusted for dividends, stock splits and changes in capital. 
Since dividends are assumed to be reinvested into the shares, they do not affect the 
index value, therefore we do not have to estimate the dividend payments. Moreover, 
as we deal with European options, we do not need the estimation of the early exercise 
premium. This latter feature is very important since our data set is by construction 
less prone to estimation errors if compared to the majority of previous studies that 
use American-style options. The difference between European and American options 
lies in the early exercise feature. The Black-Scholes formula, which is usually used 
in order to compute IV, prices only European-style options. For American options 
adjustments have to be made: for example, Barone-Adesi and Whaley [1] suggest a 
valuation formula based on the decomposition of the American option into the sum of 
a European option and a quasi-analytically estimated early exercise premium. How- 
ever, given the difficulty in implementing the Barone-Adesi and Whaley model, many 
papers (see e.g., [5]) use the Black and Scholes formula also for American options. 
Given that American option prices are generally higher than European ones, the use 
of the Black-Scholes formula will generate an IV that overstates the true IV. 
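
To make the notion of inverting the Black-Scholes formula concrete, the following Python sketch recovers the volatility of a European option on a non-dividend-paying underlying by bisection. It is a generic illustration, not the procedure used by the data provider: the function names, the search interval `[lo, hi]` and the numerical inputs are assumptions made for the example.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_price(s, k, r, t, sigma, kind="call"):
    """Black-Scholes price of a European call or put without dividends."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    if kind == "call":
        return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)
    return k * math.exp(-r * t) * norm_cdf(-d2) - s * norm_cdf(-d1)


def implied_vol(price, s, k, r, t, kind="call", lo=1e-6, hi=5.0, tol=1e-10):
    """Find sigma such that bs_price(...) equals price, by bisection.

    Relies on the option price being increasing in sigma.
    """
    if not bs_price(s, k, r, t, lo, kind) <= price <= bs_price(s, k, r, t, hi, kind):
        raise ValueError("price outside the range attainable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_price(s, k, r, t, mid, kind) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical at-the-money option with one month to expiry.
call = bs_price(100.0, 100.0, 0.03, 1.0 / 12.0, 0.25, "call")
print(implied_vol(call, 100.0, 100.0, 0.03, 1.0 / 12.0, "call"))
```

Applying the same inversion to an American option price, which includes an early exercise premium, would return a volatility above the one that generated the European value, which is the overstatement mentioned in the text.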

In order to avoid measurement errors, the data set has been filtered according to 
the following filtering constraints. First, in order not to use stale quotes, we elimi- 
nate dates with trading volume less than ten contracts. Second, we eliminate dates 
with option prices violating the standard no arbitrage bounds. After the application 
of the filters, we are left with 1860 observations out of 1928. As for the sampling 
procedure, in order to avoid the telescoping problem described in [4], we use monthly 
non-overlapping samples. In particular, we collect the prices recorded on the Wednes- 
day that immediately follows the expiry of the option (third Saturday of the expiry 
month) since the week immediately following the expiration date is one of the most
active. These options have a fixed maturity of almost one month (from 17 to 22 days
to expiration). If the Wednesday is not a trading day we move to the trading day
immediately following. The IV, provided by DATASTREAM, is obtained by inverting
the Black and Scholes formula as a weighted average of the two options closest to
being at the money and is computed for call options ($\sigma_c$) and for put options ($\sigma_p$). IV
is an ex-ante forecast of future realised volatility in the time period until the option
expiration. Therefore we compute the realised volatility ($\sigma_r$) in month t as the sample
standard deviation of the daily index returns over the option’s remaining life:

$$\sigma_r = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (R_i - \bar{R})^2},$$

where $n$ is the number of daily returns over the option’s remaining life, $R_i$ is the
return of the DAX index on day i and $\bar{R}$ is the mean return of the DAX
index in month t. We annualise the standard deviation by multiplying it by $\sqrt{252}$.
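
The realised volatility computation described above can be sketched as follows. The text does not say whether simple or logarithmic returns are used, so the log returns below are an assumption, as are the function name and the sample prices.

```python
import math


def realised_volatility(prices, trading_days=252):
    """Annualised sample standard deviation of daily log returns.

    prices: daily closing index levels over the option's remaining life.
    Log returns are an assumption; the text only speaks of daily returns.
    """
    returns = [math.log(p1 / p0) for p0, p1 in zip(prices[:-1], prices[1:])]
    n = len(returns)
    mean = sum(returns) / n
    # Sample variance with divisor n - 1.
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return math.sqrt(variance) * math.sqrt(trading_days)


# Hypothetical closing prices.
closes = [5000.0, 5040.0, 4990.0, 5025.0, 5060.0]
print(realised_volatility(closes))
```

Taking the logarithm of the result gives the log-volatility series used in the regressions.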

In order to examine the predictive power of IV versus a time series volatility 
model, following prior research (see e.g., [5, 13]), we choose to use the lagged (one 
month before) realised volatility as a proxy for historical volatility ($\sigma_h$). Descriptive
statistics for volatility and log volatility series are reported in Table 1. We can see 
that on average realised volatility is lower than both IV estimates, with call IV being 
slightly higher than put IV. As for the standard deviation, realised volatility is slightly 
higher than both IV estimates. The volatility series are highly skewed (long right 
tail) and leptokurtic. In line with the literature (see e.g., [13]) we decided to use the 
natural logarithm of the volatility series instead of the volatility itself in the empirical 
analysis for the following reasons: (i) log-volatility series conform more closely to 
normality than pure volatility series: this is documented in various papers and it is the 
case in our sample (see Table 1); (ii) natural logarithms are less likely to be affected 
by outliers in the regression analysis. 


Table 1. Descriptive statistics


Statistic     $\sigma_c$   $\sigma_p$   $\sigma_r$   $\ln\sigma_c$   $\ln\sigma_p$   $\ln\sigma_r$
Mean          0.2404       0.2395       0.2279       −1.51           −1.52           −1.6
Std dev       0.11         0.11         0.12         0.41            0.41            0.49
Skewness      1.43         1.31         1.36         0.49            0.4             0.41
Kurtosis      4.77         4.21         4.37         2.73            2.71            2.46
Jarque Bera   41.11        30.28        33.68        3.69            2.68            3.54
p-value       0.00         0.00         0.00         0.16            0.26            0.17


3 The methodology


The information content of IV is examined both in univariate and in encompassing
regressions. In univariate regressions, realised volatility is regressed against one of
the three volatility forecasts (call IV ($\sigma_c$), put IV ($\sigma_p$), historical volatility ($\sigma_h$)) in
order to examine the predictive power of each volatility estimator. The univariate
regressions are the following:


$$\ln(\sigma_r) = \alpha + \beta \ln(\sigma_i), \qquad (2)$$


where $\sigma_r$ is realised volatility and $\sigma_i$ is the volatility forecast, $i = h, c, p$. In encompassing
regressions, realised volatility is regressed against two or more volatility forecasts in
order to distinguish which one has the highest explanatory power. We choose to
compare pairwise one IV forecast (call, put) with historical volatility in order to see if
IV subsumes all the information contained in historical volatility. The encompassing
regressions used are the following:


$$\ln(\sigma_r) = \alpha + \beta \ln(\sigma_i) + \gamma \ln(\sigma_h), \qquad (3)$$


where $\sigma_r$ is realised volatility, $\sigma_i$ is implied volatility, $i = c, p$, and $\sigma_h$ is historical
volatility. Moreover, we compare call IV and put IV in order to understand if the
information carried by call (put) prices is more valuable than the information carried
by put (call) prices:

$$\ln(\sigma_r) = \alpha + \beta \ln(\sigma_p) + \gamma \ln(\sigma_c), \qquad (4)$$


where $\sigma_r$ is realised volatility, $\sigma_c$ is call IV and $\sigma_p$ is put IV.
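A minimal sketch of a univariate regression in logs follows, assuming plain OLS with a single regressor. The function `ols_univariate` and the two short series are hypothetical and only show the mechanics; they are not the DAX data used in the paper.

```python
import math

def ols_univariate(x, y):
    """OLS fit of y = alpha + beta * x; returns (alpha, beta, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ss_res = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return alpha, beta, 1.0 - ss_res / ss_tot

# Hypothetical monthly volatility series (forecast and subsequent realisation).
implied = [0.22, 0.25, 0.19, 0.31, 0.28, 0.24]
realised = [0.20, 0.24, 0.18, 0.29, 0.27, 0.21]
alpha, beta, r2 = ols_univariate([math.log(v) for v in implied],
                                 [math.log(v) for v in realised])
```

A slope close to one together with an intercept different from zero would correspond to a forecast that is unbiased only after a constant adjustment.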

Following [4], we tested three hypotheses in the univariate regressions (2). The
first hypothesis concerns the amount of information about future realised volatility
contained in the volatility forecast. If the volatility forecast contains some information,
then the slope coefficient should be different from zero. Therefore we test if $\beta = 0$ and
we see whether it can be rejected. The second hypothesis is about the unbiasedness
of the volatility forecast. If the volatility forecast is an unbiased estimator of future
realised volatility, then the intercept should be zero and the slope coefficient should
be one ($H_0$: $\alpha = 0$ and $\beta = 1$). In case this latter hypothesis is rejected, we see if at
least the slope coefficient is equal to one ($H_0$: $\beta = 1$) and, if not rejected, we interpret
the volatility forecast as unbiased after a constant adjustment. Finally, if IV is efficient
then the error term should be white noise and uncorrelated with the information
set. In encompassing regressions there are three hypotheses to be tested. The first is
about the efficiency of the volatility forecast: we test whether the volatility forecast
(call IV, put IV) subsumes all the information contained in historical volatility. If so,
the slope coefficient of historical volatility should be equal to zero
($H_0$: $\gamma = 0$). Moreover, as a joint test of information content and efficiency we test
if the slope coefficients of historical volatility and IV (call, put) are equal to zero and
one respectively ($H_0$: $\beta = 1$ and $\gamma = 0$). Following [13], we ignore the intercept in
the latter null hypothesis, and if our null hypothesis is not rejected, we interpret the
volatility forecast as unbiased after a constant adjustment. Finally we investigate the
different information content of call IV and put IV. To this end we test, in augmented
regression (4), if $\gamma = 0$ and $\beta = 1$, in order to see if put IV subsumes all the
information contained in call IV.

In contrast to other papers (see e.g., [3,5]) that use American options on dividend 
paying indexes, our data set of European-style options on a non-dividend paying 
index is free of measurement errors that may arise in the estimation of the dividend 
yield and the early exercise premium. Nonetheless, as we are using closing prices for 
the index and the option that are non-synchronous (15 minutes’ difference) and we 
are ignoring bid ask spreads, some measurement errors may still affect our estimates. 
Therefore we adopt an instrumental variable procedure: we regress call (put) IV on an
instrument (in univariate regressions) and on an instrument and any other exogenous
variable (in encompassing and augmented regressions) and replace fitted values in the
original univariate and encompassing regressions. As the instrument for call (put) IV
we use both historical volatility and past call (put) IV as they are possibly correlated
to the true call (put) IV, but unrelated to the measurement error associated with call
(put) IV one month later. As an indicator of the presence of errors in variables we
use the Hausman [11] specification test statistic. The Hausman specification test is
defined as:

$$\frac{(\beta_{TSLS} - \beta_{OLS})^2}{Var(\beta_{TSLS}) - Var(\beta_{OLS})},$$

where $\beta_{TSLS}$ is the beta obtained through the
Two Stages Least Squares procedure, $\beta_{OLS}$ is the beta obtained through the Ordinary
Least Squares (OLS) procedure and $Var(x)$ is the variance of the coefficient $x$. The
Hausman specification test is distributed as a $\chi^2(1)$.
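The statistic can be evaluated directly once the two slope estimates and their variances are available. The sketch below assumes the single-coefficient form of the test; the function name `hausman_statistic` and the input values are hypothetical and are not estimates from the paper.

```python
def hausman_statistic(beta_tsls, beta_ols, var_tsls, var_ols):
    """Squared difference of the two slopes scaled by the difference of their variances."""
    var_diff = var_tsls - var_ols
    if var_diff <= 0.0:
        raise ValueError("Var(beta_TSLS) must exceed Var(beta_OLS)")
    return (beta_tsls - beta_ols) ** 2 / var_diff

# 5% critical value of a chi-square distribution with one degree of freedom.
CHI2_1_CRITICAL_5PCT = 3.841

# Hypothetical estimates, for illustration only.
m = hausman_statistic(beta_tsls=1.10, beta_ols=1.05, var_tsls=0.02, var_ols=0.01)
errors_in_variables = m > CHI2_1_CRITICAL_5PCT
```

With these illustrative inputs the statistic is far below the critical value, so the OLS estimates would be retained.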


4 The results 


The results of the OLS univariate (equation (2)), encompassing (equation (3)), and
augmented (equation (4)) regressions are reported in Table 2. In all the regressions
the residuals are normal, homoscedastic and not autocorrelated (the Durbin Watson
statistic is not significantly different from two and the Breusch-Godfrey LM test
confirms no autocorrelation up to lag 12). First of all, in the three univariate regressions
all the beta coefficients are significantly different from zero: this means that all three
volatility forecasts (call IV, put IV and historical) contain some information about
future realised volatility. However, the null hypothesis that any of the three volatility
forecasts is unbiased is strongly rejected in all cases. In particular, in our sample,
realised volatility is on average a little lower than the two IV forecasts, suggesting that
IV overpredicts realised volatility. The adjusted $R^2$ is the highest for put IV, closely
followed by call IV. Historical volatility has the lowest adjusted $R^2$. Therefore put
IV is ranked first in explaining future realised volatility, closely followed by call IV,
while historical volatility is the last. The null hypothesis that $\beta$ is not significantly
different from one cannot be rejected at the 10% critical level for the two IV estimates,
while it is strongly rejected for historical volatility. Therefore we can consider both
IV estimates as unbiased after a constant adjustment given by the intercept of the
regression.

In encompassing regressions (3) we compare pairwise call/put IV forecast with
historical volatility in order to understand if IV subsumes all the information contained
in historical volatility. The results are striking and provide strong evidence for both the
unbiasedness and efficiency of both IV forecasts. First of all, from the comparison of
univariate and encompassing regressions, the inclusion of historical volatility does not
improve the goodness of fit according to the adjusted $R^2$. In fact, the slope coefficient
of historical volatility is not significantly different from zero at the 10% level in
the encompassing regressions (3), indicating that both call and put IV subsume all
the information contained in historical volatility. The slope coefficients of both call
and put IV are not significantly different from one at the 10% level and the joint
test of information content and efficiency ($\gamma = 0$ and $\beta = 1$) does not reject the
null hypothesis, indicating that both IV estimates are efficient and unbiased after a
constant adjustment.

In order to see if put IV has more predictive power than call IV, we test in augmented
regression (4) if $\gamma = 0$ and $\beta = 1$. The joint test $\gamma = 0$ and $\beta = 1$ does not
reject the null hypothesis. We see that the slope coefficient of put IV is significantly
different from zero only at the 5% level, while the slope coefficient of call IV is not
significantly different from zero. As an additional test we regress $\ln(\sigma_c)$ on $\ln(\sigma_p)$
($\ln(\sigma_p)$ on $\ln(\sigma_c)$) and retrieve the residuals. Then we run univariate regression (2)
for $\ln(\sigma_c)$ ($\ln(\sigma_p)$) using as an additional explanatory variable the residuals retrieved
from the regression of $\ln(\sigma_c)$ on $\ln(\sigma_p)$ ($\ln(\sigma_p)$ on $\ln(\sigma_c)$). The residuals are significant
only in the regression of $\ln(\sigma_r)$ on $\ln(\sigma_c)$, pointing to the fact that put IV contains
slightly more information on future realised volatility than call IV.

A possible concern is the problem of data snooping, which occurs when the 
properties of a data set influence the choice of the estimator or test statistic (see 
e.g., [7]) and may arise in a multiple regression model, when a large number of 
explanatory variables are compared and the selection of the candidate variables is 
not based on a financial theory (e.g., in [20] 3654 models are compared to a given 
benchmark, in [18] 291 explanatory variables are used in a multiple regression). This 
is not the case in our regressions, since (i) we do not have any parameter to estimate, 
(ii) we use only three explanatory variables: historical volatility, call IV and put IV, 
that are compared pairwise in the regressions and (iii) the choice has been made on the 
theory that IV, being derived from option prices, is a forward-looking measure of ex 
post realised volatility and is deemed as the market’s expectation of future volatility. 
Finally, in order to test the robustness of our results and see if IV has been measured
with errors, we adopt an instrumental variable procedure and run a two-stage least
squares regression. The Hausman [11] specification test, reported in the last column of Table 2,
indicates that the errors in variables problem is not significant in univariate regressions
(2), in encompassing regressions (3) or in augmented regression (4).¹ Therefore we
can trust the OLS regression results.

In our sample both IV forecasts obtain almost the same performance, with put IV 
marginally better than call IV. These results are very different from the ones obtained 
both in [3] and in [9]. The difference can possibly be attributed to the option exercise 
feature, which in our case is European and not American, and to the underlying index 


¹ In augmented regression (4) the instrumental variables procedure is used for the variable
$\ln(\sigma_p)$.

Table 2. OLS regressions


Intercept   $\ln(\sigma_c)$  $\ln(\sigma_p)$  $\ln(\sigma_h)$  Adj $R^2$  DW     $\chi^{2a}$   $\chi^{2b}$   Hausman test

−0.01                        1.05***                           0.77       1.73   13.139                      0.10021
(0.915)                      (0.000)                                             (0.00)

−0.018      1.047***                                           0.76       1.77   13.139                      0.25128
(0.853)     (0.000)                                                              (0.00)

−0.29                                         0.82             0.65       2.12   7.517
(0.008)                                       (0.000)                            (0.02)

−0.02       0.938***                          0.10+++          0.76       1.87                 1.288         0.47115
(0.850)     (0.000)                           (0.400)                                          (0.53)

−0.01                        0.9631***        0.082+++         0.77       1.80                 1.158         0.95521
(0.915)                      (0.000)          (0.489)                                          (0.56)

0.0006      0.372            0.6861+                           0.77       1.74                 2.04          0.14977
(0.994)     (0.244)          (0.033)                                                           (0.35)


Note: The numbers in brackets are the p-values. The $\chi^{2a}$ column reports the statistic of a
$\chi^2$ test for the joint null hypothesis $\alpha = 0$ and $\beta = 1$ in the following univariate regressions:
$\ln(\sigma_r) = \alpha + \beta \ln(\sigma_i)$, where $\sigma_r$ = realised volatility and $\sigma_i$ = volatility forecast, $i = h, c, p$. The
$\chi^{2b}$ column reports the statistic of a $\chi^2$ test for the joint null hypothesis $\gamma = 0$ and $\beta = 1$ in the
following regressions: $\ln(\sigma_r) = \alpha + \beta \ln(\sigma_i) + \gamma \ln(\sigma_h)$, $\ln(\sigma_r) = \alpha + \beta \ln(\sigma_p) + \gamma \ln(\sigma_c)$,
where $\sigma_r$ = realised volatility, $\sigma_i$ = volatility forecast, $i = c, p$, and $\sigma_h$ = historical volatility.
The superscripts ***, **, * indicate that the slope coefficient is not significantly different from
one at the 10%, 5% and 1% critical level respectively. The superscripts +++, ++, + indicate
that the slope coefficient is not significantly different from zero at the 10%, 5% and 1% critical
level respectively. The last column reports the Hausman [11] specification test statistic (one
degree of freedom), where the 5% critical level is equal to 3.841.


features, which in our case do not require the dividend payment estimation. Another 
possible explanation stems from the characteristics of the data set used. In particular 
in our case put IV was on average lower than call IV, while in [3] the opposite is true. 
As IV usually overpredicts realised volatility, if a choice has to be made between call 
and put IV, a rule of thumb can be to choose the lower of the two.


5 Conclusions 


In this paper we have investigated the relation between IV, historical volatility and 
realised volatility in the DAX index options market. Since IV varies across option type 
(call versus put), we have run a horse race of different IV estimates: call IV, put IV. Two
hypotheses have been tested: unbiasedness and efficiency of the different volatility 
forecasts. Our results suggest that both IV forecasts contain more information about 
future realised volatility than historical volatility. In particular, they are unbiased 
(after a constant adjustment) and efficient forecasts of realised volatility in that they 
subsume all the information contained in historical volatility. In our sample both IV 
forecasts obtain almost the same performance, with put IV marginally better than call
IV. This is an interesting result and is a warning against the a priori choice of using
call IV. The recent turmoil in financial markets caused by the current financial crisis 
has determined high levels of volatility. High on the research agenda is to test the 
unbiasedness and efficiency hypotheses using the most recent volatility data.


Binomial algorithms for the evaluation of options on
stocks with fixed per share dividends 


Martina Nardon and Paolo Pianca 


Abstract. We consider options written on assets which pay cash dividends. Dividend payments 
have an effect on the value of options: high dividends imply lower call premia and higher 
put premia. Recently, Haug et al. [13] derived an integral representation formula that can be 
considered the exact solution to problems of evaluating both European and American call 
options and European put options. For American-style put options, early exercise may be 
optimal at any time prior to expiration, even in the absence of dividends. In this case, numerical 
techniques, such as lattice approaches, are required. Discrete dividends produce a discrete shift in
the tree; as a result, the tree is no longer reconnecting beyond any dividend date. While methods 
based on non-recombining trees give consistent results, they are computationally expensive. 
In this contribution, we analyse binomial algorithms for the evaluation of options written on 
stocks which pay discrete dividends and perform some empirical experiments, comparing the 
results in terms of accuracy and speed. 


Key words: options on stocks, discrete dividends, binomial lattices 


1 Introduction 


We consider options written on assets which pay dividends. Dividends are announced 
as a pure cash amount $D$ to be paid at a specified ex-dividend date $t_D$. Empirically,
one observes that at the ex-dividend date the stock price drops. Hence dividends imply 
lower call premia and higher put premia. In order to exclude arbitrage opportunities, 
the jump in the stock price should be equal to the size of the net dividend. Since we 
cannot use the proportionality argument, the price dynamics depend on the timing of 
the dividend payment. 

Usually, derivative pricing theory assumes that stocks pay known dividends, both 
in size and timing. Moreover, new dividends are often supposed to be equal to the 
former ones. Even if these assumptions might be too strong, in what follows we 
assume that we know both the amount of dividends and times in which they are paid. 

Valuation of options on stocks which pay discrete dividends is a rather hard 
problem which has received a lot of attention in the financial literature, but there is 
much confusion concerning the evaluation approaches. Different methods have been
proposed for the pricing of both European and American options on dividend paying
stocks, which suggest various model adjustments (such as, for example, subtracting 
the present value of the dividend from the asset spot price). Nevertheless, all such 
approximations have some drawbacks and are not so efficient (see e.g., Haug [11] for 
a review). 

Haug and Haug [12] and Beneder and Vorst [2] propose a volatility adjustment 
which takes into account the timing of the dividend. The idea behind the approximation 
is to leave volatility unchanged before the dividend payment and to apply the adjusted 
volatility after the dividend payment. This method performs particularly poorly in the 
presence of multiple dividends. A more sophisticated volatility adjustment to be used 
in combination with the escrowed dividend model is proposed by Bos et al. [4]. The 
method is quite accurate for most cases. Nevertheless, for very large dividends, or in 
the case of multiple dividends, the method can yield significant mispricing. A slightly 
different implementation (see Bos and Vandermark [5]) adjusts both the stock price 
and the strike. The dividends are divided into two parts, called “near” and “far”,
which are used for the adjustments to the spot and the strike price respectively. This 
approach seems to work better than the approximation mentioned above. Haug et 
al. [13] derive an integral representation formula that can be considered the exact 
solution to problems of evaluating both European and American call options and 
European put options. Recently, de Matos et al. [7] derived arbitrarily accurate lower 
and upper bounds for the value of European options on a stock paying a discrete 
dividend. 

For American-style put options, it can be optimal to exercise at any time prior to 
expiration, even in the absence of dividends. Unfortunately, no analytical solutions 
for both the option price and the exercise strategy are available, hence one is generally 
forced to numerical solutions, such as binomial approaches. As is well known (see 
Merton [14]), in the absence of dividends, it is never optimal to exercise an American 
call before maturity. If a cash dividend payment is expected during the lifetime of 
the option, it might be optimal to exercise an American call option right before the 
ex-dividend date, while for an American put it may be optimal to exercise at any point 
in time until maturity. 

Lattice methods are commonly used for the pricing of both European and Ameri- 
can options. In the binomial model (see Cox et al. [6]), the pricing problem is solved 
by backward induction along the tree. In particular, for American options, at each 
node of the lattice one has to compare the early exercise value with the continuation 
value. 

In this contribution, we analyse binomial algorithms for the evaluation of op- 
tions written on stocks which pay discrete dividends of both European and American 
types. In particular, we consider non-recombining binomial trees, hybrid binomial 
algorithms for both European and American call options, based on the Black-Scholes 
formula for the evaluation of the option after the ex-dividend date and up to maturity; a 
binomial method which implements the efficient continuous approximation proposed 
in [5]; and we propose a binomial method based on an interpolation idea given by 
Vellekoop and Nieuwenhuis [17], in which the recombining feature is maintained. 
The model based on the interpolation procedure is also extended to the case of multiple
dividends; this feature is very important for the pricing of long-term options and
index options. We performed some empirical experiments and compared the results in
terms of accuracy and speed.


2 European-style options 


Dividends affect option prices through their effect on the underlying stock price. In 
a continuous time setting, the underlying price dynamics depend on the timing of
the dividend payment and are assumed to satisfy the following stochastic differential
equation


$$\begin{aligned} dS_t &= r S_t\,dt + \sigma S_t\,dW_t, \qquad t \neq t_D, \\ S_{t_D^+} &= S_{t_D^-} - D, \end{aligned} \qquad (1)$$

where $S_{t_D^-}$ and $S_{t_D^+}$ denote the stock price levels right before and after the jump at time
$t_D$, respectively. Due to this discontinuity, the solution to equation (1) is no longer
log-normal but in the form¹


$$S_t = S_0 e^{(r - \sigma^2/2)t + \sigma W_t} - D e^{(r - \sigma^2/2)(t - t_D) + \sigma W_{t - t_D}} \mathbb{1}_{\{t \ge t_D\}}. \qquad (2)$$


Recently, Haug et al. [13] (henceforth HHL) derived an integral representation
formula for the fair price of a European call option on a dividend paying stock.
The basic idea is that after the dividend payment, option pricing reduces to a simple
Black-Scholes formula for a non-dividend paying stock. Before $t_D$ one considers the
discounted expected value of the BS formula adjusted for the dividend payment. In
the geometric Brownian motion setup, the HHL formula is


$$C_{HHL}(S_0, D, t_D) = e^{-r t_D} \int_d^{\infty} c_E(S_x - D, t_D)\, \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx, \qquad (3)$$


where $d = \frac{\ln(D/S_0) - (r - \sigma^2/2)t_D}{\sigma\sqrt{t_D}}$, $S_x = S_0 e^{(r - \sigma^2/2)t_D + \sigma\sqrt{t_D}\,x}$ and $c_E(S_x - D, t_D)$ is
simply the BS formula with time to maturity $T - t_D$. The integral representation
(3) can be considered as the exact solution to the problem of valuing a European
call option written on stock with a discrete and known dividend. Let us observe that
the well known put-call parity relationship allows the immediate calculation of the
theoretical price of a European put option with a discrete dividend.
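To make the structure of formula (3) concrete, the following sketch evaluates the integral numerically. It assumes Simpson's rule on a truncated range and a strictly positive dividend; the function names, the truncation point `upper` and the number of subintervals `n` are hypothetical choices of this sketch, not part of the HHL paper.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, x, r, sigma, tau):
    """Black-Scholes price of a European call on a non-dividend paying stock."""
    if s <= 0.0:
        return 0.0
    d1 = (math.log(s / x) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - x * math.exp(-r * tau) * norm_cdf(d2)

def hhl_european_call(s0, x, r, sigma, T, D, t_d, n=2000, upper=8.0):
    """European call with one cash dividend D > 0 at t_d, by Simpson's rule on [d, upper]."""
    drift = (r - 0.5 * sigma ** 2) * t_d
    vol = sigma * math.sqrt(t_d)
    d = (math.log(D / s0) - drift) / vol  # below d the stock price does not cover D
    if d >= upper:
        return 0.0

    def integrand(y):
        s_x = s0 * math.exp(drift + vol * y)
        density = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
        return bs_call(s_x - D, x, r, sigma, T - t_d) * density

    h = (upper - d) / n  # n must be even for Simpson's rule
    total = integrand(d) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(d + i * h)
    return math.exp(-r * t_d) * total * h / 3.0

# Hypothetical inputs: one dividend of 5 paid halfway through a one-year option.
price = hhl_european_call(100.0, 100.0, 0.05, 0.2, 1.0, 5.0, 0.5)
```

As a sanity check, letting the dividend shrink towards zero should reproduce the plain Black-Scholes price, which is a convenient way to validate the quadrature settings.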


3 American-style options 


Most traded options are of American style. The effect of a discrete dividend payment 
on American option prices is different than for European options. While for European- 
style options the pricing problem basically arises from mis-specifying the variance 


¹ $\mathbb{1}_A$ denotes the indicator function of $A$.
of the underlying process, for American options the impact on the optimal exercise
strategy is more important. As is well known, it is never optimal to exercise an 
American call option on non-dividend paying stocks before maturity. As a result, the 
American call has the same value as its European counterpart. In the presence of 
dividends, it may be optimal to exercise the American call and put before maturity. 
In general, early exercise is optimal when it leads to an alternative income stream, 
i.e., dividends from the stock for a call and interest rates on cash for a put option. In 
the case of discrete cash dividends, the call option may be optimally exercised early 
instantaneously prior to the ex-dividend date,² $t_D$; while for a put it may be optimal
to exercise at any point in time till maturity. Simple adjustments like subtracting the 
present value of the dividend from the asset spot price make little sense for American 
options. 

The first approximation to the value of an American call on a dividend paying 
stock was suggested by Black in 1975 [3]. This is basically the escrowed dividend 
method, where the stock price in the BS formula is replaced by the stock price minus 
the present value of the dividend. In order to account for early exercise, one also 
computes an option value just before the dividend payment, without subtracting the 
dividend. The value of the option is considered to be the maximum of these values. 

A model which is often used and implemented in much commercial software was 
proposed, simplified and adjusted by Roll [15], Geske [8, 10] and Whaley [18] (RGW 
model). These authors construct a portfolio of three European call options which 
represents an American call and accounts for the possibility of early exercise right 
before the ex-dividend date. The portfolio consists of two long positions with exercise
prices $X$ and $S^* + D$ and maturities $T$ and $t_D$, respectively. The third option is a short
call on the first of the two long calls with exercise price $S^* + D - X$ and maturity $t_D$. The
stock price $S^*$ makes the holder of the option indifferent between early exercise at time
$t_D$ and continuing with the option. Formally, we have $C(S^*, T - t_D, X) = S^* + D - X$.
This equation can be solved if the ex-dividend date is known. The two long positions
follow from the BS formula, while for the compound option Geske [9] provides an 
analytical solution. 
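The indifference condition $C(S^*, T - t_D, X) = S^* + D - X$ has no closed-form solution for $S^*$, but it can be solved numerically. The sketch below assumes a simple bisection and $D < X$; the function `critical_price` and its tolerance are hypothetical and are not taken from the RGW papers.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, x, r, sigma, tau):
    """Black-Scholes price of a European call on a non-dividend paying stock."""
    d1 = (math.log(s / x) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - x * math.exp(-r * tau) * norm_cdf(d2)

def critical_price(x, r, sigma, tau, D, tol=1e-10):
    """Solve c(S*, tau, X) = S* + D - X for S*; returns None if no finite root exists."""
    if D <= x * (1.0 - math.exp(-r * tau)):
        return None  # the dividend is too small for early exercise to pay off

    def gap(s):
        # Continuation value minus exercise value; decreasing in s.
        return bs_call(s, x, r, sigma, tau) - (s + D - x)

    lo, hi = 1e-8, x
    while gap(hi) > 0.0:  # expand the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical inputs: strike 100, half a year between ex-dividend date and maturity.
s_star = critical_price(100.0, 0.05, 0.2, 0.5, 5.0)
```

Bisection is used here only because the gap between continuation and exercise value is monotone in the stock price; a Newton iteration would be a natural alternative.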

The RGW model was considered for more than twenty years as a brilliant solution 
in closed form to the problem of evaluating American call options on equities that 
pay a discrete dividend. Although some authoritative authors still consider the RGW 
formula as the exact solution, the model does not yield good results in many cases 
of practical interest. Moreover, it is possible to find situations in which the use of the
RGW formula allows for arbitrage. Whaley, in a recent monograph [19], presents an
example that shows the limits of the RGW model. 

Haug et al. [13] derived an integral representation formula for the American call
option fair price in the presence of a single dividend $D$ paid at time $t_D$. Since early
exercise is only optimal instantaneously prior to the ex-dividend date, in order to
obtain the exact solution for an American call option with a discrete dividend one can
merely replace relation (3) with

$$C_{HHL}(S_0, D, t_D) = e^{-r t_D} \int_d^{\infty} \max\{S_x - X, c_E(S_x - D, t_D)\}\, \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx. \qquad (4)$$


² Note that after the dividend date $t_D$, the option is a standard European call which can be
priced using the BS formula; this idea can be implemented in a hybrid BS-binomial model.


4 Binomial models 


The evaluation of options using binomial methods is particularly easy to implement 
and efficient at standard conditions, but it becomes difficult to manage in the case in 
which the underlying asset pays one or more discrete dividends, due to the fact that 
the number of nodes grows considerably and entails huge calculations. In the absence 
of dividends or when dividends are assumed to be proportional to the stock price, 
the binomial tree reconnects in the sense that the price after a up-down movement 
coincides with the price after a down-up movement. As a result, the number of nodes 
at each step grows linearly. 
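For reference, a recombining tree without dividends can be sketched as follows. This is a textbook Cox-Ross-Rubinstein implementation for an American put, written as an illustration; the function name and the inputs are hypothetical.

```python
import math

def crr_american_put(s0, x, r, sigma, T, n):
    """American put on a recombining CRR tree with n steps (no dividends)."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral probability of an up move
    disc = math.exp(-r * dt)
    # Step i has i + 1 nodes, indexed by the number j of up moves.
    values = [max(x - s0 * u ** j * d ** (n - j), 0.0) for j in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(i + 1):
            continuation = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            exercise = x - s0 * u ** j * d ** (i - j)
            values[j] = max(continuation, exercise)
    return values[0]

# Hypothetical inputs.
put_price = crr_american_put(100.0, 100.0, 0.05, 0.2, 1.0, 500)
```

Because the tree recombines, the whole valuation needs a single array of length $n + 1$ that is overwritten at each backward step.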

If during the life of the option a dividend of amount $D$ is paid, at each node after the
ex-dividend date a new binomial tree has to be considered (see Fig. 1), with the result 
that the total number of nodes increases to the point that it is practically impossible 
to consider trees with an adequate number of stages. To avoid such a complication, 
often it is assumed that the underlying dynamics are characterised by a dividend yield 
which is discrete and proportional to the stock price. Formally, 


$$S_{ij} = \begin{cases} S_0 u^j d^{i-j}, & j = 0, 1, \ldots, i, \\ S_0 (1 - q) u^j d^{i-j}, & j = 0, 1, \ldots, i, \end{cases} \qquad (5)$$


where the first law applies in the period preceding the ex-dividend date and the second
applies after the dividend date, and where $S_0$ indicates the initial price, $q$ is the dividend
yield, and $u$ and $d$ are respectively the upward and downward coefficients, defined
by $u = e^{\sigma\sqrt{T/n}}$ and $d = 1/u$. The hypothesis of a proportional dividend yield can
be accepted as an approximation of dividends paid in the long term, but it is not 
acceptable in a short period of time during which the stock pays a dividend in cash 
and its amount is often known in advance or estimated with appropriate accuracy. 

If the underlying asset is assumed to pay a discrete dividend $D$ at time $t_D < T$
(which in a discrete time setting corresponds to the step $n_D$), the dividend amount
is subtracted at all nodes at time point $t_D$. Due to this discrete shift in the tree, as
already noticed, the lattice is no longer recombining beyond the time $t_D$ and the
binomial method becomes computationally expensive, since at each node at time
$t_D$ a separate binomial tree has to be evaluated until maturity (see Fig. 1). Also, in
the presence of multiple dividends this approach remains theoretically sound, but 
becomes unattractive due to the computational intensity. 
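The non-recombining construction for a single dividend can be sketched as below: a fresh tree is started from each ex-dividend price at step $n_D$, and the resulting values are rolled back to the root. The function names, the argument order and the treatment of early exercise just before the dividend are assumptions of this sketch.

```python
import math

def crr_value(s0, x, r, sigma, dt, steps, is_call, american):
    """Plain recombining CRR tree started at s0 with the given number of steps."""
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)
    disc = math.exp(-r * dt)
    sign = 1.0 if is_call else -1.0
    values = [max(sign * (s0 * u ** j * d ** (steps - j) - x), 0.0) for j in range(steps + 1)]
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            v = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                v = max(v, sign * (s0 * u ** j * d ** (i - j) - x))
            values[j] = v
    return values[0]

def nonrecombining_price(s0, x, r, sigma, T, n, D, n_d, is_call=True, american=True):
    """One cash dividend D at step n_d; a separate sub-tree is valued at each node."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)
    disc = math.exp(-r * dt)
    sign = 1.0 if is_call else -1.0
    values = []
    for j in range(n_d + 1):
        s_cum = s0 * u ** j * d ** (n_d - j)
        # Sub-tree starts from the ex-dividend price (floored at zero).
        v = crr_value(max(s_cum - D, 0.0), x, r, sigma, dt, n - n_d, is_call, american)
        if american:
            v = max(v, sign * (s_cum - x))  # exercise just before the dividend
        values.append(v)
    for i in range(n_d - 1, -1, -1):
        for j in range(i + 1):
            v = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                v = max(v, sign * (s0 * u ** j * d ** (i - j) - x))
            values[j] = v
    return values[0]

# Hypothetical inputs: dividend of 5 at the midpoint of a 40-step tree.
call_price = nonrecombining_price(100.0, 100.0, 0.05, 0.2, 1.0, 40, 5.0, 20)
```

The cost is one sub-tree per node at step $n_D$, which is what makes the approach expensive for fine grids and impractical for several dividends.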

Schroder [16] describes how to implement discrete dividends in a recombining 
tree. The approach is based on the escrowed dividend process idea, but the method 
leads to significant pricing errors. 

Fig. 1. Non-recombining binomial tree after the dividend payment


The problem of the enormous growth in the number of nodes that occurs in such
a case can be simplified if it is assumed that the price has a stochastic component $\tilde{S}$
given by

$$\tilde{S}_i = \begin{cases} S_i - D e^{-r(t_D - i)}, & i \le t_D, \\ S_i, & i > t_D, \end{cases} \qquad (6)$$


and a deterministic component represented by the discounted value of the dividend or
of dividends that will be distributed in the future. Note that the stochastic component
gives rise to a reconnecting tree. Moreover, one can build a new tree (which is still
reconnecting) by adding the present value of future dividends to the price of the
stochastic component at each node. Hence the tree reconnects and
the number of nodes in each period $i$ is equal to $i + 1$.

The recombining technique described above can be improved through a procedure 
that preserves the structure of the tree until the ex-dividend time and that will force the 
recombination after the dividend payment. For example, you can force the binomial 
tree to recombine by taking, immediately after the payment of a dividend, as extreme 
nodes 


$$S_{n_D+1,0} = (S_{n_D,0} - D)d, \qquad S_{n_D+1,n_D+1} = (S_{n_D,n_D} - D)u, \qquad (7)$$


and by calculating the arithmetic average of the values that are not recombining. This 
technique has the characteristic of being simple from the computational point of view. 

Alternatively, you can use a technique, called “stretch”, that calculates the extreme 
nodes as in the previous case; in such a way, one forces the reconnection at the 
intermediate nodes by choosing the upward coefficients as follows 


$$u(i, j) = e^{\lambda\sigma\sqrt{T/n}}, \qquad (8)$$


where $\lambda$ is chosen in order to make equal the prices after an up and down movement.
This technique requires a greater amount of computations as at each stage both the 
coefficients and the corresponding probabilities change. 
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In this paper, we analyse a method which performs very efficiently and can be 
applied to both European and American call and put options. It is a binomial method 
which maintains the recombining feature and is based on an interpolation idea pro- 
posed by Vellekoop and Nieuwenhuis [17]. 

For an American option, the method can be described as follows: a standard
binomial tree is constructed without considering the payment of the dividend (with
$S_{ij} = S_0 u^j d^{i-j}$, $u = e^{\sigma\sqrt{T/n}}$, and $d = 1/u$), then it is evaluated by backward
induction from maturity until the dividend payment; at the node corresponding to an
ex-dividend date (at step $n_D$), we approximate the continuation value $V_{n_D}$ using the
following linear interpolation


V(S_{n_D,j}) = \frac{V(S_{n_D,k+1}) - V(S_{n_D,k})}{S_{n_D,k+1} - S_{n_D,k}} \, (S_{n_D,j} - S_{n_D,k}) + V(S_{n_D,k}), \qquad (9)


for $j = 0, 1, \ldots, n_D$ and $S_{n_D,k} \le S_{n_D,j} \le S_{n_D,k+1}$; then we continue backward
along the tree. The method can be easily implemented also in the case of multiple 
dividends (which are not necessarily of the same amount). 
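
The backward induction with interpolation at the ex-dividend step can be sketched as follows. This is only an illustrative reading of the procedure, not the authors' optimised implementation: it assumes that the continuation value is interpolated at the ex-dividend price S - D, that values outside the node range are clamped to the nearest node, and that a single dividend is paid; the names `interp` and `binomial_interp` are hypothetical.

```python
import math
from bisect import bisect_right


def interp(x, xs, ys):
    """Linear interpolation on an increasing grid xs; clamps outside the grid."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    k = bisect_right(xs, x) - 1              # xs[k] <= x < xs[k + 1]
    w = (x - xs[k]) / (xs[k + 1] - xs[k])
    return ys[k] + w * (ys[k + 1] - ys[k])


def binomial_interp(S0, X, r, sigma, T, D, t_D, n, kind="call", american=False):
    """Recombining binomial tree with one cash dividend D paid at time t_D < T."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)     # risk-neutral up probability
    disc = math.exp(-r * dt)
    n_D = round(t_D / T * n)                 # step of the ex-dividend date

    def nodes(i):
        # S_ij = S0 * u^j * d^(i-j) = S0 * u^(2j-i), increasing in j
        return [S0 * u ** (2 * j - i) for j in range(i + 1)]

    def payoff(S):
        return max(S - X, 0.0) if kind == "call" else max(X - S, 0.0)

    V = [payoff(S) for S in nodes(n)]
    for i in range(n - 1, -1, -1):
        S = nodes(i)
        V = [disc * (p * V[j + 1] + (1 - p) * V[j]) for j in range(i + 1)]
        if american:                         # exercise at ex-dividend prices
            V = [max(v, payoff(s)) for v, s in zip(V, S)]
        if i == n_D and D > 0:
            # Assumption: value before the dividend = value at the price S - D
            V = [interp(s - D, S, V) for s in S]
            if american:                     # exercise just before the dividend
                V = [max(v, payoff(s)) for v, s in zip(V, S)]
    return V[0]
```

For instance, `binomial_interp(100, 100, 0.05, 0.2, 1.0, 5.0, 0.5, 2000)` corresponds to the European call with X = 100 and dividend date 0.5 reported in Table 1, so the output can be compared with that entry. Because the tree stays recombining, step i always has i + 1 nodes and the cost grows quadratically in n.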

We have implemented a very efficient method which combines this interpolation 
procedure and the binomial algorithm for the evaluation of American options proposed 
by Basso et al. [1] (see footnote 3).

We performed some empirical experiments and compared the results in terms of
accuracy and speed.


5 Numerical experiments 


In this section, we briefly report the results of some empirical experiments related 
to European calls and American calls and puts. In Table 1, we compare the prices 
provided by the HHL exact formula for the European call, with those obtained with 
the 2000-step non-recombining binomial method and the binomial method based on
interpolation (9). We also report the results obtained with the approximation proposed 
by Bos and Vandermark [5] (BV). For a European call, the non-recombining binomial 
method requires a couple of seconds, while the calculations with a 2000-step binomial 
interpolated method are immediate. 

Table 2 shows the results for the American call and put options. We have compared 
the results obtained with non-recombining binomial methods and the 10,000-step 
binomial method based on the interpolation procedure (9). In the case of the American 
put, the BV approximation leads to considerable pricing errors. 

We also extended the model based on the interpolation procedure to the case of 
multiple dividends. Table 3 shows the results for the European call with multiple 


3 The algorithm exploits two devices: (1) the symmetry of the tree, which implies that all the
asset prices defined in the lattice at any stage belong to the set
$\{S_0 u^j : j = -n, -n+1, \ldots, 0, \ldots, n-1, n\}$, and (2) the fact that in the nodes of the early exercise region, the
option value, equal to the intrinsic value, does not need to be recomputed when exploring
the tree backwards.
Table 1. European calls with dividend D = 5 ($S_0$ = 100, T = 1, r = 0.05, $\sigma$ = 0.2)


t_D     X      HHL        Non-rec. bin.   Interp. bin.   BV
                          (n = 2000)      (n = 2000)

0.25    70     28.7323    28.7323         28.7324        28.7387
0.25    100    7.6444     7.6446          7.6446         7.6456
0.25    130    0.9997     0.9994          1.000          0.9956
0.5     70     28.8120    28.8120         28.8121        28.8192
0.5     100    7.7740     7.7742          7.7742         7.7743
0.5     130    1.0501     1.0497          1.0506         1.0455
0.75    70     28.8927    28.8927         28.8928        28.8992
0.75    100    7.8997     7.8999          7.8999         7.9010
0.75    130    1.0972     1.0969          1.0977         1.0934


Table 2. American call and put options with dividend D = 5 ($S_0$ = 100, T = 1, r = 0.05,
$\sigma$ = 0.2)


               American call                        American put
t_D     X      Non-rec. hyb. bin.   Interp. bin.    Non-rec. bin.   Interp. bin.   BV
               (n = 5000)           (n = 10,000)    (n = 2000)      (n = 10,000)
0.25    70     30.8740              30.8744         0.2680          0.2680         0.2630
0.25    100    7.6587               7.6587          8.5162          8.5161         8.5244
0.25    130    0.9997               0.9998          33.4538         33.4540        35.0112
0.5     70     31.7553              31.7557         0.2875          0.2876         0.2901
0.5     100    8.1438               8.1439          8.4414          8.4412         8.5976
0.5     130    1.0520               1.0522          32.1195         32.1198        35.0112
0.75    70     32.6407              32.6411         0.3070          0.3071         0.2901
0.75    100    9.1027               9.1030          8.2441          8.2439         8.6689
0.75    130    1.1764               1.1767          30.8512         30.8515        35.0012


dividends. We have compared the non-reconnecting binomial method with n = 2000 
steps (only for the case with one and two dividends) and the interpolated binomial 
method with n = 10,000 steps (our results are in line with those obtained by Haug 
et al. [13]). 

Table 4 shows the results for the American call and put options for different 
maturities in the interpolated binomial method with multiple dividends. 


6 Conclusions 


The evaluation of options on stocks that pay discrete dividends has been the subject of
numerous studies concerning both closed-form formulas and approximate numerical
methods. Recently, Haug et al. [13] proposed an integral expression that allows
the calculation of European call and put options and American call options in precise 
Table 3. European call option with multiple dividends D = 5 paid at times $t_D \in$
{0.5, 1.5, 2.5, 3.5, 4.5, 5.5}, for different maturities T = 1,...,6 ($S_0$ = 100, X = 100,
r = 0.05, $\sigma$ = 0.2)


T    Non-rec. bin.   Interp. bin.
     (n = 2000)      (n = 10,000)
1    7.7742          7.7741
2    10.7119         10.7122
3    -               12.7885
4    -               14.4005
5    -               15.7076
6    -               16.7943


Table 4. American options with multiple dividends in the interpolated 10,000-step binomial
method (with parameters $S_0$ = 100, X = 100, r = 0.05, $\sigma$ = 0.2); a cash dividend D = 5 is
paid at the dates $t_D \in$ {0.5, 1.5, 2.5, 3.5, 4.5, 5.5}, for different maturities T = 1,...,6


T    American call   American put
1    8.1439          8.4412
2    11.2792         11.5904
3    13.3994         13.7399
4    15.0169         15.3834
5    16.3136         16.7035
6    17.3824         17.7938


terms. The formula proposed by Haug et al. requires the calculation of an integral. 
Such an integral representation is particularly interesting because it can be extended 
to the case of non-Brownian dynamics and to the case of multiple dividends. 

The pricing of American put options written on stocks which pay discrete dividends
can be obtained with a standard binomial scheme that produces very accurate results, 
but it leads to non-recombining trees and therefore the number of nodes does not grow 
linearly with the number of steps. 

In this contribution, we implemented alternative methods to the classical bino- 
mial approach for American options: a hybrid binomial-Black-Scholes algorithm, a 
binomial method which translates the continuous approximation proposed in [5] and 
a binomial method based on an interpolation procedure, in which the recombining 
feature is maintained. We performed some empirical experiments and compared the 
results in terms of accuracy and efficiency. In particular, the efficient implementation 
of the method based on interpolation yields very accurate and fast results. 
Nonparametric prediction in time series analysis: some
empirical results 


Marcella Niglio and Cira Perna 


Abstract. In this paper a new approach to select the lag p for time series generated from
Markov processes is proposed. The problem is addressed in the nonparametric domain and the
approach is based on the minimisation of the estimated risk of prediction of one-step-ahead
kernel predictors. The proposed procedure has been evaluated through a Monte Carlo study and
in an empirical context to forecast the weekly 90-day US T-bill secondary market rates.


Key words: kernel predictor, estimated risk of prediction, subsampling 


1 Introduction 


One of the aims in time series analysis is forecasting future values taking advantage 
of current and past knowledge of the data-generating processes. These structures 
are often summarised with parametric models that, based on specific assumptions, 
define the relationships among variables. In this parametric context a large number 
of models have been proposed (among others, [3], [20], [4], and, more recently, [11], 
which discusses parametric and nonparametric methods) and for most of them the 
forecast performance has been evaluated. 

To overcome the problem of prior knowledge about the functional form of the 
model, a number of nonparametric methods have been proposed and widely used 
in statistical applications. In this context, our attention is focused on nonparametric 
analysis based on kernel methods which have received increasing attention due to 
their flexibility in modelling complex structures. 

In particular, given a Markov process of order p, in this paper a new approach to 
select the lag p is proposed. It is based on the minimisation of the risk of prediction, 
proposed in [13], estimated for kernel predictors by using the subsampling. 

After presenting some results on the kernel predictors, we discuss, in Section 2, 
how they can be introduced in the proposed procedure. 

In Section 3 we further describe the algorithm whose performance has been dis- 
cussed in a Monte Carlo study. To evaluate the forecast accuracy of the nonparametric 
predictor in the context of real data, in Section 4 we present some results on the weekly 
90-day US T-bill secondary market rates. Some concluding remarks are given at the
end. 


2 Nonparametric kernel predictors 


Let $X_T = \{X_1, X_2, \ldots, X_T\}$ be a real-valued time series. We assume that:

A1: $X_T$ is a realisation of a strictly stationary stochastic process that belongs to
the class:

X_t = f(X_{t-1}, X_{t-2}, \ldots, X_{t-p}) + \epsilon_t, \qquad 1 \le t \le T, \qquad (1)


where the innovations $\{\epsilon_t\}$ are i.i.d. random variables, independent from the past of
$X_t$, with $E(\epsilon_t) = 0$, $E(\epsilon_t^2) = \sigma^2 < +\infty$, and p is a nonnegative integer.

In class (1), $f(X_{t-1}, X_{t-2}, \ldots, X_{t-p})$ is the conditional expectation of $X_t$, given
$X_{t-1}, \ldots, X_{t-p}$, which can be estimated in different ways. When the Nadaraya-Watson
(N-W)-type estimator (among others see [2]) is used:


\hat f(x_1, x_2, \ldots, x_p) = \frac{\sum_{t=p+1}^{T} \prod_{i=1}^{p} K\left(\frac{x_i - X_{t-i}}{h_i}\right) X_t}{\sum_{t=p+1}^{T} \prod_{i=1}^{p} K\left(\frac{x_i - X_{t-i}}{h_i}\right)}, \qquad (2)


where $K(\cdot)$ is a kernel function and $h_i$ is the bandwidth, for $i = 1, 2, \ldots, p$.

Under mixing conditions, the asymptotic properties of the estimator (2) have been 
widely investigated in [18], and the main properties, when it is used in predictive 
context, have been discussed in [9] and [2]. 

When the estimator (2) is used, the selection of the “optimal” bandwidth, the 
choice of the kernel function and the determination of the autoregressive order p are 
needed. To solve the latter problem, many authors refer to automatic methods, such 
as AIC and BIC, or to their nonparametric analogue suggested by [19]. 

Here we propose a procedure based on one-step-ahead kernel predictors. 

The estimator for the conditional mean (2) has a large application in prediction 
contexts. In fact, when a quadratic loss function is selected to find a predictor for 
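$X_{T+\ell}$, with lead time $\ell > 0$, it is well known that the best predictor is given by
$\hat X_{T+\ell} = E[X_{T+\ell} | X_T]$, obtained from the minimisation of


\arg\min_{\hat X_{T+\ell} \in \mathbb{R}} E[(\hat X_{T+\ell} - X_{T+\ell})^2 | X_T], \qquad \text{with } \ell > 0.


It implies that when N-W estimators are used to forecast $X_{T+\ell}$, the least-squares
predictor $\hat X_{T+\ell}$ becomes:

\hat X_{T+\ell} = \frac{\sum_{t=p+1}^{T-\ell} \prod_{i=1}^{p} K\left(\frac{x_i - X_{t-i}}{h_i}\right) X_{t+\ell}}{\sum_{t=p+1}^{T-\ell} \prod_{i=1}^{p} K\left(\frac{x_i - X_{t-i}}{h_i}\right)}. \qquad (3)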




The properties of (3) in the presence of strictly stationary and Markovian processes 
of order p are discussed in [9] and [2]. 

Under well defined assumptions on the generating process, [9] shows that when
$\ell = 1$ the predictor (3) is a strongly consistent estimator for $E[X_{T+1}|X_T]$ and this
result has been subsequently generalised in [2] to the case with $\ell > 1$.

In the presence of real data, [5], [16], [10] and recently [23] evaluate the forecast
accuracy of (3) and give empirical criteria to define confidence intervals for $X_{T+\ell}$.

In the following, taking advantage of (3), we propose the use of the estimated risk
of prediction (ERP), discussed in [13], to select the order p of the
autoregression (1).

In particular we further assume that: 

A2: $X_T$ is a realisation of a strong mixing (or $\alpha$-mixing) process.

Under conditions A1 and A2, the ERP can be estimated through resampling techniques and in particular
using the subsampling approach as proposed by [13].

The subsampling has a number of interesting advantages with respect to other 
resampling techniques: in particular it is robust against misspecified models and gives 
consistent results under weak assumptions. 

This last remark makes the use of subsampling particularly useful in a nonpara- 
metric framework and can be properly applied in the context of model selection. 


Let $\hat X_{T+1}$ be the N-W predictor (3); its mean square error is defined as

\Delta_T = E[(\hat X_{T+1} - X_{T+1})^2].


The algorithm we propose to select p is established on the estimation of $\Delta_T$ that,
as described in Procedure 1, is based on the overlapping subsampling. Note that in 
this procedure Step 2 implies the choice of the subsample length b. A large number 
of criteria have been proposed in the statistical literature to select b (inter alia [17]). 
Here we refer to the proposal in [14], which describes an empirical rule for estimating 
the optimal window size in the presence of dependent data of smaller length (m) than 
the original (T). The details are given in Procedure 2. 


Procedure 1: Selection of the autoregressive order p 


1. Choose a grid for $p \in \{1, \ldots, P\}$.
2. Select the subsample length b (Procedure 2).
3. For each p, compute the estimated risk of prediction (ERP):

\hat\Delta_{T,b} = (T - b + 1)^{-1} \sum_{i=0}^{T-b} (\hat X_{i+b} - X_{i+b})^2,

where $\hat X_{i+b}$ is the one-step-ahead predictor (3) of $X_{i+b}$, based on the subsample
$(X_{i+1}, X_{i+2}, \ldots, X_{i+b-1})$ of length b.
4. Select the p which minimises $\hat\Delta_{T,b}$.
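
The steps of Procedure 1 can be sketched in a few lines. The code below is a simplified illustration under stated assumptions, not the implementation used in the paper: the subsample length `b` and a single common bandwidth `h` are passed in by hand (the paper selects b with Procedure 2 and the bandwidths by cross-validation), the kernel is Gaussian, and the function names `nw_predict`, `erp` and `select_order` are hypothetical.

```python
import math


def nw_predict(block, p, h):
    """One-step-ahead Nadaraya-Watson forecast from `block` (Gaussian product kernel)."""
    x = block[-1:-p - 1:-1]                  # last p observations, most recent first
    num = den = 0.0
    for t in range(p, len(block)):
        w = 1.0
        for i in range(1, p + 1):            # product kernel over the p lags
            z = (x[i - 1] - block[t - i]) / h
            w *= math.exp(-0.5 * z * z)
        num += w * block[t]
        den += w
    # Fallback (an assumption): use the block mean if all weights underflow
    return num / den if den > 0 else sum(block) / len(block)


def erp(series, p, b, h):
    """Estimated risk of prediction from overlapping subsamples."""
    T = len(series)
    errors = []
    for i in range(T - b + 1):
        block = series[i:i + b - 1]          # X_{i+1}, ..., X_{i+b-1}
        target = series[i + b - 1]           # X_{i+b}
        errors.append((nw_predict(block, p, h) - target) ** 2)
    return sum(errors) / len(errors)


def select_order(series, grid, b, h):
    """Return the order in `grid` with the smallest ERP, plus all ERP values."""
    risks = {p: erp(series, p, b, h) for p in grid}
    return min(risks, key=risks.get), risks
```

A call such as `select_order(series, [1, 2, 3, 4], b=30, h=0.5)` returns the selected order together with the estimated risks; the values of `b` and `h` here are purely illustrative.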
Procedure 2: Subsample length selection


1. Fix m < T and compute $\hat X_{T,m}$, the subsampling estimate from the entire data
set $X_T$.
2. For all $b_m < m$, compute $\hat X^{(i)}_{m,b_m}$, the subsampling estimate of the forecast
computed from $(X_i, X_{i+1}, \ldots, X_{i+m-1})$.
3. Select the value $\hat b_m$ that minimizes the estimated mean square error (EMSE):

EMSE(m, b_m) = (T - m + 1)^{-1} \sum_{i=1}^{T-m+1} (\hat X^{(i)}_{m,b_m} - \hat X_{T,m})^2;

4. Choose $\hat b = (T/m)^{\delta} \cdot \hat b_m$, where $\delta \in (0, 1)$ is a fixed real number.


3 Simulation results 


To illustrate the performance of the proposed procedure we have used simulated data 
sets generated by models with known structure. The aim is to evaluate the ability 
of our procedure to select a proper value for the autoregressive parameter p in the 
presence of given data-generating processes. 

The simulated time series have been generated by two structures: a linear autore- 
gressive model (AR) and a self-exciting threshold autoregressive model (SETAR) 
that, as is well known, both belong to the class of Markov processes (1). 

More precisely the simulated models are: 

Model 1 - AR(1): $X_t = -0.8 X_{t-1} + \epsilon_t$, with $\epsilon_t \sim N(0, 1)$;
Model 2 - SETAR(2;1,1):
$X_t = 1.5 - 0.9 X_{t-1} + \epsilon_t$ if $X_{t-1} \le 0$,
$X_t = -0.4 - 0.6 X_{t-1} + \epsilon_t$ if $X_{t-1} > 0$,
where Model 2 has been used in [21] to evaluate the forecast ability of SETAR models. 
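
As a rough illustration, the two data-generating processes can be simulated as below. The sketch reads the first SETAR regime as 1.5 - 0.9 X_{t-1} for X_{t-1} <= 0, and it assumes standard Gaussian innovations for Model 2 and a burn-in period to remove the effect of the starting value; none of these choices is confirmed by the text, and the function names are hypothetical.

```python
import random


def simulate_ar1(T, phi=-0.8, burn=100, seed=None):
    """Simulate X_t = phi * X_{t-1} + e_t with e_t ~ N(0, 1)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for t in range(T + burn):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn:                        # discard the burn-in (assumption)
            out.append(x)
    return out


def simulate_setar(T, burn=100, seed=None):
    """Simulate the two-regime SETAR(2;1,1) model with threshold 0."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for t in range(T + burn):
        e = rng.gauss(0.0, 1.0)              # N(0, 1) innovations are an assumption
        if x <= 0:
            x = 1.5 - 0.9 * x + e            # lower regime
        else:
            x = -0.4 - 0.6 * x + e           # upper regime
        if t >= burn:
            out.append(x)
    return out
```

Series of length T = 70 or T = 100, as in the Monte Carlo study, are obtained with `simulate_ar1(70, seed=1)` or `simulate_setar(100, seed=1)`.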

The simulation study has been implemented defining a grid value for p = 1, 2, 3, 4 
and using series of length T = 70 and T = 100. 

In order to take into account the two different lengths, we have chosen two grids
for m. When T = 70, the grid is m = {20, 25, 30, 35} whereas for T = 100 it is
m = {25, 35, 40, 50}. The two values for T have been chosen to evaluate the proposed 
procedure in the presence of series of moderate length whereas the grid for m has 
been defined following [14]. 

Starting from these values, Procedure 1 has been run in a Monte Carlo study with
100 replications. Following [13] we have fixed the parameter $\delta = 0.4$ whereas the
kernel function is Gaussian and the bandwidths $h_i$ ($i = 1, 2, \ldots, p$) in (3) are selected
using a cross-validation criterion.

The results are summarised in Tables 1 and 2 where the distribution of the 100
simulated series is presented for the AR(1) and SETAR(2; 1,1) models respectively,
comparing the classes in which $\hat b$ lies and the candidate values for p.




Table 1. Distribution of the 100 series of length T = 70 and T = 100 respectively, simulated
from Model 1


        $\hat b$ (T = 70)                          $\hat b$ (T = 100)
p    [9,21]   [22,34]   [35,43]   Tot.     p    [9,29]   [30,44]   [45,64]   Tot.

1    14       23        51        88       1    19       21        50        90
2    1        4         2         7        2    1        3         1         5
3    4        0         0         4        3    2        2         1         5
4    0        1         0         1        4    0        0         0         0


Table 2. Distribution of the 100 series of length T = 70 and T = 100 respectively, simulated
from Model 2


        $\hat b$ (T = 70)                          $\hat b$ (T = 100)
p    [17,26]  [27,35]   [36,44]   Tot.     p    [15,31]  [32,47]   [48,64]   Tot.

1    15       11        59        85       1    21       14        54        89
2    1        0         4         5        2    2        2         3         7
3    3        6         1         10       3    1        0         3         4
4    0        0         0         0        4    0        0         0         0


In both cases, the proposed procedure gives satisfactory results on the selection
of the autoregressive order in the presence of a Markov process of order 1 and, as
expected, the results improve as T grows.

Note that the good performance is obtained even for time series of moderate length
T, which adds to the interest of the procedure.

As expected, most "well selected" models belong to the last class of $\hat b$. This should
not be surprising because the results used in the proposed procedure are mainly given
in an asymptotic context.


4 Empirical results on 90-day US T-bill rate 


Starting from the theoretical results described in Section 2, the model selection pro- 
cedure has been applied to generate forecasts from the weekly 90-day US T-bill 
secondary market rates covering the period 4 January 1957—17 December 1993. The 
time series, $X_t$, of length T = 1929, has been extracted from the H.15 release of the
Federal Reserve System (http://www.federalreserve.gov/releases/h15/data.htm). 
The 90-day US T-bill has been widely investigated in nonlinear and nonpara- 
metric literature (among others [1] and [15]) and, in particular, the data set under 
study, plotted in Figure 1, has been analysed in [10] to compare three kernel-based 
multi-step predictors. The authors, after computing proper unit-root tests, show the 
nonstationarity of the series $X_t$, which can be empirically appreciated by observing
the correlogram in Figure 2, which presents a very slow decay.

Fig. 1. Weekly 90-day US T-bill secondary market rates: 4 January 1957-17 December 1993

Fig. 2. ACF plot of the weekly 90-day US T-bill


In Table 4 the nonlinearity of $r_t$ is further investigated through the likelihood ratio
(LR) test proposed in [6] and [7], where the linearity of the process is tested against
threshold nonlinearity. In particular, the test statistic with the corresponding p-value
is presented when the null autoregressive model of order p and the threshold delay
d (of the alternative hypothesis) allow rejection of the linearity of the data-generating
process.
The first differences of $X_t$ (denoted by $r_t$ in the following) are then plotted in
Figure 3 where it is evident that the behaviour of the series changes considerably in
the time interval taken into account.

Following [10], which further assesses the nonlinearity of the data-generating
process of $r_t$, we examine the conditional mean of $r_t$, neglecting the conditional
heteroscedasticity that gives rise to the volatility clustering that can be clearly seen
in Figure 3.

Starting from these results, we first evaluate some features of the series using
the descriptive indexes presented in Table 3. In particular, the mean, the median, the
standard deviation, the skewness and kurtosis (given as third and fourth moment of
the standardised data respectively) of $r_t$ are computed. As expected, the distribution
of $r_t$ has null median and shows negative skewness and heavy tails.

It is widely known that when the prediction of the mean level of asymmetric time 
series needs to be generated, a parametric structure that can be properly applied is 
the SETAR(2; $p_1$, $p_2$) model that treats positive and negative values of $r_t$ differently.
This is the reason why the SETAR models have been widely applied to analyse and 
forecast data related to financial markets (among others: [20] for a wide presentation 
of the model and [12] for its application to financial data). 


Table 3. Descriptive indexes of $r_t$


        Mean          Median   S.D.     Skewness   Kurtosis
r_t     -6.224e-05    0        0.2342   -0.5801    16.6101


Fig. 3. First differences of the weekly 90-day US T-bill rate ($r_t$)
Table 4. LR linearity test of $r_t$


        p    d    Stat (p-value)
r_t     4    1    30.5996 (1.9343e-05)


The results of the LR test show that a linear structure does not seem to be capable
of catching the structure of the generating process (this explains the poor performance 
of the autoregressive forecasts in [10]). The threshold autoregressive model fitted to 
the data is clearly based on a strict parametric structure from which the forecasts are 
generated. 

Here, we alternatively propose the nonparametric predictor (3), which is more 
flexible than that generated from SETAR models, and whose Markov order is selected 
following Procedure 1. 

For both approaches, we have generated one-step-ahead, out-of-sample forecasts 
following an expanding window algorithm over the forecast horizon L = 26, which 
corresponds to the last six months of the time interval under analysis. 

Further, a null threshold value has been fixed for the SETAR model (with threshold 
delay given in Table 4) and at each iteration the model has been estimated following 
[22]. 

SETAR and nonparametric least-squares forecasts have been evaluated using the
mean square error and the mean absolute error,
$MSE(L) = L^{-1} \sum_{i=1}^{L} (\hat X_{T+i} - X_{T+i})^2$ and
$MAE(L) = L^{-1} \sum_{i=1}^{L} |\hat X_{T+i} - X_{T+i}|$, whose values are compared in
Table 5 where the MSE (and the MAE) of (3) over the MSE (and MAE) of the SETAR
predictions are shown.
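
The two accuracy measures and the ratios reported in Table 5 can be computed as in the following sketch. The function names and the toy numbers in the usage note are illustrative only and are not taken from the T-bill data.

```python
def mse(forecasts, actuals):
    """Mean square error over the forecast horizon."""
    return sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals)


def mae(forecasts, actuals):
    """Mean absolute error over the forecast horizon."""
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


def relative_accuracy(np_forecasts, thr_forecasts, actuals):
    """Ratios MSE_np / MSE_thr and MAE_np / MAE_thr; values below 1 favour
    the nonparametric predictor."""
    return (mse(np_forecasts, actuals) / mse(thr_forecasts, actuals),
            mae(np_forecasts, actuals) / mae(thr_forecasts, actuals))
```

For example, with actual values `[1, 2, 3]`, nonparametric forecasts `[1, 2, 4]` and threshold forecasts `[2, 2, 5]`, the two ratios are 0.2 and 1/3.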


Table 5. MSE (and MAE) of the nonparametric forecasts over the MSE (and MAE) of the
SETAR forecasts

        MSE(L)_np [MSE(L)_thr]^{-1}    MAE(L)_np [MAE(L)_thr]^{-1}
r_t     0.839081                       0.944058


The better forecast accuracy, in terms of MSE and MAE, of predictor (3) can be 
appreciated. It further confirms the good performance of the proposed procedure in the 
presence of one-step-ahead forecasts. Moreover, the forecast accuracy seems not to 
be affected when different values, of moderate size, are assigned to m in Procedure 2. 


5 Conclusions


We have proposed a procedure to select the order p in the presence of strictly stationary
Markov processes (1). It is based on the use of one-step-ahead predictors generated
from nonparametric Nadaraya-Watson kernel smoothers.

The selection of p is obtained from the minimisation of a quadratic loss function
that makes use of the subsampling estimate of the one-step-ahead forecasts as shown 
in Procedure 1. 

The simulated and empirical results show the good performance of the proposed 
procedure that can be considered, in the context of model selection, an alternative to 
more consolidated approaches given in the literature. 

Much remains to be done: to investigate the properties of $\hat p$; to generalise the
procedure to the case with lead time $\ell > 1$; to consider more complex data-generating
processes that belong to the Markov class. Further, the procedure could be extended
to parametric and/or semiparametric predictors that can be properly considered to
minimise $\hat\Delta_{T,b}$.

All these tasks need proper evaluation of the computational effort that is required
when computer-intensive methods are selected.


Acknowledgement. The authors would like to thank two anonymous referees for their useful 
comments. 


References 


1. Barkoulas, J.T., Baum, C.F., Onochie, J.: A nonparametric investigation of the 90-day T-bill
rate. Rev. Finan. Econ. 6, 187-198 (1997)
2. Bosq, D.: Nonparametric Statistics for Stochastic Processes. Springer, New York (1996)
3. Box, G.E.P., Jenkins, G.M.: Time Series Analysis: Forecasting and Control. Holden-Day,
San Francisco (1976)
4. Brockwell, P.J., Davis, R.A.: Time Series: Theory and Methods. Springer-Verlag, New
York (1991)
5. Carbon, M., Delecroix, M.: Non-parametric vs parametric forecasting in time series: a
computational point of view. Appl. Stoch. Models Data Anal. 9, 215-229 (1993)
6. Chan, K.S.: Testing for threshold autoregression. Ann. Stat. 18, 1886-1894 (1990)
7. Chan, K.S., Tong, H.: On likelihood ratio test for threshold autoregression. J. R. Stat. Soc.
(B) 52, 595-599 (1990)
8. Clements, M.P.: Evaluating Econometric Forecasts of Economic and Financial Variables.
Palgrave Macmillan, New York (2005)
9. Collomb, G.: Propriétés de convergence presque complète du prédicteur à noyau.
Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 66, 441-460 (1984)
10. De Gooijer, J., Zerom, D.: Kernel-based multistep-ahead predictions of the US short-term
interest rate. J. Forecast. 19, 335-353 (2000)
11. Fan, J., Yao, Q.: Nonlinear Time Series. Nonparametric and Parametric Methods. Springer-
Verlag, New York (2003)
12. Franses, P.H., van Dijk, D.: Non-Linear Time Series Models in Empirical Finance. Cam-
bridge University Press, Cambridge (2000)
13. Fukuchi, J.: Subsampling and model selection in time series analysis. Biometrika 86,
591-604 (1999)
14. Hall, P., Jing, B.: On sample reuse methods for dependent data. J. R. Stat. Soc. (B) 58,
727-737 (1996)


15. Lanne, M., Saikkonen, P.: Modeling the U.S. short-term interest rate by mixture autore-
gressive processes. J. Finan. Econ. 1, 96-125 (2003)
16. Matzner-Løber, E., De Gooijer, J.: Nonparametric forecasting: a comparison of three
kernel-based methods. Commun. Stat.: Theory Methods 27, 1593-1617 (1998)
17. Politis, D.N., Romano, J.P., Wolf, M.: Subsampling. Springer-Verlag, New York (1999)
18. Robinson, P.M.: Nonparametric estimators for time series. J. Time Ser. Anal. 4, 185-207
(1983)
19. Tjøstheim, D., Auestad, H.: Nonparametric identification of nonlinear time series: selecting
significant lags. J. Am. Stat. Assoc. 89, 1410-1419 (1994)
20. Tong, H.: Nonlinear Time Series: A Dynamical System Approach. Oxford University
Press, Oxford (1990)
21. Tong, H., Moeanaddin, R.: On multi-step non-linear least squares prediction. Statistician
37, 101-110 (1988)
22. Tsay, R.: Testing and modelling threshold autoregressive processes. J. Am. Stat. Assoc.
84, 231-240 (1989)
23. Vilar-Fernández, J.M., Cao, R.: Nonparametric forecasting in time series. A comparative
study. Commun. Stat.: Simul. Comput. 36, 311-334 (2007)


On efficient optimisation of the CVaR and 
related LP computable risk measures 
for portfolio selection 


Włodzimierz Ogryczak and Tomasz Śliwiński


Abstract. The portfolio optimisation problem is modelled as a mean-risk bicriteria optimi- 
sation problem where the expected return is maximised and some (scalar) risk measure is 
minimised. In the original Markowitz model the risk is measured by the variance while several 
polyhedral risk measures have been introduced leading to Linear Programming (LP) com- 
putable portfolio optimisation models in the case of discrete random variables represented by 
their realisations under specified scenarios. Recently, the second order quantile risk measures 
have been introduced and become popular in finance and banking. The simplest such measure, 
now commonly called the Conditional Value at Risk (CVaR) or Tail VaR, represents the mean 
shortfall at a specified confidence level. The corresponding portfolio optimisation models can 
be solved with general purpose LP solvers. However, in the case of more advanced simulation 
models employed for scenario generation one may get several thousands of scenarios. This 
may lead to an LP model with a huge number of variables and constraints, thus decreasing
the computational efficiency of the model. We show that the computational efficiency can
then be dramatically improved with an alternative model taking advantage of the LP duality.
Moreover, a similar reformulation can be applied to more complex quantile risk measures like
Gini’s mean difference as well as to the mean absolute deviation.


Key words: risk measures, portfolio optimisation, computability, linear programming 


1 Introduction 


In the original Markowitz model [12] the risk is measured by the variance, but sev- 
eral polyhedral risk measures have been introduced leading to Linear Programming 
(LP) computable portfolio optimisation models in the case of discrete random vari- 
ables represented by their realisations under specified scenarios. The simplest LP 
computable risk measures are dispersion measures similar to the variance. Konno 
and Yamazaki [6] presented the portfolio selection model with the mean absolute 
deviation (MAD). Yitzhaki [25] introduced the mean-risk model using Gini’s mean 
(absolute) difference as the risk measure. Gini’s mean difference turns out to be a
special aggregation technique of the multiple criteria LP model [17] based on the 
pointwise comparison of the absolute Lorenz curves. The latter leads to the quantile 
shortfall risk measures that are more commonly used and accepted. Recently, the
second-order quantile risk measures have been introduced in different ways by many
authors [2, 5, 15, 16, 22]. The measure, usually called the Conditional Value at Risk
(CVaR) or Tail VaR, represents the mean shortfall at a specified confidence level. 
Maximisation of the CVaR measures is consistent with the second-degree stochastic 
dominance [19]. Several empirical analyses confirm its applicability to various finan- 
cial optimisation problems [1,10]. This paper is focused on computational efficiency 
of the CVaR and related LP computable portfolio optimisation models. 

For returns represented by their realisations under T scenarios, the basic LP model
for CVaR portfolio optimisation contains T auxiliary variables as well as T
corresponding linear inequalities. Actually, the number of structural constraints in the LP
model (matrix rows) is proportional to the number of scenarios T, while the number
of variables (matrix columns) is proportional to the total of the number of scenarios
and the number of instruments T + n. Hence, its dimensionality is proportional to the
number of scenarios T. This does not cause any computational difficulties for a few
hundred scenarios, as in computational analysis based on historical data. However, in the
case of more advanced simulation models employed for scenario generation one may
get several thousands of scenarios [21]. This may lead to an LP model with a huge
number of auxiliary variables and constraints, thus decreasing the computational
efficiency of the model. Actually, in the case of fifty thousand scenarios and one hundred
instruments the model may require more than half an hour of computation time [8]
with a state-of-the-art LP solver (CPLEX code). We show that the computational
efficiency can then be dramatically improved with an alternative model formulation
taking advantage of the LP duality. In the introduced model the number of structural
constraints is proportional to the number of instruments n, while only the number of
variables is proportional to the number of scenarios T, thus not affecting the
simplex method efficiency so seriously. Indeed, the computation time is then below 30
seconds. Moreover, a similar reformulation can be applied to the classical LP
portfolio optimisation model based on the MAD as well as to more complex quantile risk
measures including Gini’s mean difference [25].


2 Computational LP models 


The portfolio optimisation problem considered in this paper follows the original
Markowitz formulation and is based on a single period model of investment. At
the beginning of a period, an investor allocates the capital among various securities,
thus assigning a nonnegative weight (share of the capital) to each security. Let
$J = \{1, 2, \ldots, n\}$ denote a set of securities considered for an investment. For each
security $j \in J$, its rate of return is represented by a random variable $R_j$ with a given
mean $\mu_j = E\{R_j\}$. Further, let $x = (x_j)_{j=1,2,\ldots,n}$ denote a vector of decision
variables $x_j$ expressing the weights defining a portfolio. The weights must satisfy a set
of constraints to represent a portfolio. The simplest way of defining a feasible set P
is by a requirement that the weights must sum to one and they are nonnegative (short
sales are not allowed), i.e.,

P = \{ x : \sum_{j=1}^{n} x_j = 1, \; x_j \ge 0 \text{ for } j = 1, \ldots, n \}. \qquad (1)


Hereafter, we perform detailed analysis for the set P given with constraints (1). 
Nevertheless, the presented results can easily be adapted to a general LP feasible set 
given as a system of linear equations and inequalities, thus allowing one to include 
short sales, upper bounds on single shares or portfolio structure restrictions which 
may be faced by a real-life investor. 

Each portfolio $x$ defines a corresponding random variable $R_x = \sum_{j=1}^{n} R_j x_j$ that
represents the portfolio rate of return, while the expected value can be computed as
$\mu(x) = \sum_{j=1}^{n} \mu_j x_j$. We consider $T$ scenarios with probabilities $p_t$ (where
$t = 1, \ldots, T$). We assume that for each random variable $R_j$ its realisation $r_{jt}$ under
scenario $t$ is known. Typically, the realisations are derived from historical data, treating
$T$ historical periods as equally probable scenarios ($p_t = 1/T$). The realisations of
the portfolio return $R_x$ are given as $y_t = \sum_{j=1}^{n} r_{jt} x_j$.
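The scenario notation above can be illustrated with a few lines of NumPy. This is only a minimal sketch: the function name `portfolio_scenarios`, the array layout `R[t, j]` for $r_{jt}$ and the small data set are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def portfolio_scenarios(R, x, p=None):
    """Return the scenario returns y_t and the expected return mu(x).

    R[t, j] holds the realisation r_jt of security j under scenario t
    (layout chosen for this sketch). If p is omitted, the T scenarios
    are treated as equally probable, p_t = 1/T.
    """
    R = np.asarray(R, dtype=float)
    x = np.asarray(x, dtype=float)
    T = R.shape[0]
    p = np.full(T, 1.0 / T) if p is None else np.asarray(p, dtype=float)
    y = R @ x                 # y_t = sum_j r_jt x_j
    mu = p @ R                # mu_j = sum_t p_t r_jt
    return y, float(mu @ x)   # mu(x) = sum_j mu_j x_j

# Illustrative data: 4 scenarios, 2 securities, equal weights.
R = np.array([[0.02, -0.01],
              [0.00,  0.03],
              [-0.04, 0.01],
              [0.06,  0.01]])
x = np.array([0.5, 0.5])
y, mean_return = portfolio_scenarios(R, x)
```

With equal probabilities the expected return returned by the function coincides with the plain average of the scenario returns `y`.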

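Let us consider a portfolio optimisation problem based on CVaR measure optimisation.
With security returns given by discrete random variables with realisations
$r_{jt}$, following [1, 9, 10], the CVaR portfolio optimisation model can be formulated as
the following LP problem:

$$
\begin{aligned}
\text{maximise} \quad & \eta - \frac{1}{\beta} \sum_{t=1}^{T} p_t d_t \\
\text{s.t.} \quad & \sum_{j=1}^{n} x_j = 1, \quad x_j \ge 0 \quad \text{for } j = 1, \ldots, n, \\
& d_t - \eta + \sum_{j=1}^{n} r_{jt} x_j \ge 0, \quad d_t \ge 0 \quad \text{for } t = 1, \ldots, T,
\end{aligned}
\tag{2}
$$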

where $\eta$ is an unbounded variable and $\beta$ is the tolerance level. Except for the core
portfolio constraints (1), model (2) contains $T$ nonnegative variables $d_t$ plus a single
$\eta$ variable and $T$ corresponding linear inequalities. Hence, its dimensionality is
proportional to the number of scenarios $T$. Exactly, the LP model contains $T + n + 1$
variables and $T + 1$ constraints. For a few hundred scenarios, as in typical computational
analysis based on historical data [11], such LP models are easily solvable. However, the
use of more advanced simulation models for scenario generation may result in several
thousand scenarios. The corresponding LP model (2) then contains a huge number of
variables and constraints, thus decreasing its computational efficiency dramatically. If
the core portfolio constraints contain only linear relations, like (1), then computational
efficiency can easily be achieved by taking advantage of the LP dual to model (2). The
LP dual model takes the following form:

$$
\begin{aligned}
\text{minimise} \quad & q \\
\text{s.t.} \quad & q - \sum_{t=1}^{T} r_{jt} u_t \ge 0 \quad \text{for } j = 1, \ldots, n, \\
& \sum_{t=1}^{T} u_t = 1, \\
& 0 \le u_t \le \frac{p_t}{\beta} \quad \text{for } t = 1, \ldots, T.
\end{aligned}
\tag{3}
$$


The dual LP model contains $T$ variables $u_t$, but the $T$ constraints corresponding to
variables $d_t$ from (2) take the form of simple upper bounds (SUB) on $u_t$, thus not
affecting the problem complexity (cf. [13]). Actually, the number of constraints in
(3) is proportional to the portfolio size $n$, thus it is independent of the
number of scenarios. Exactly, there are $T + 1$ variables and $n + 1$ constraints. This
guarantees a high computational efficiency of the dual model even for a very large 
number of scenarios. Note that introducing a lower bound on the required expected 
return in the primal portfolio optimisation model (2) results only in a single additional 
variable in the dual model (3). Similarly, other portfolio structure requirements are 
modelled with a rather small number of constraints, thus generating a small number 
of additional variables in the dual model. 
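The relation between the primal model (2) and the dual model (3) can be checked numerically on a small instance. The sketch below is an illustration only: it uses SciPy's `linprog` with the HiGHS backend rather than the CPLEX code used in the paper, and the function names, the array layout `R[t, j]` for $r_{jt}$ and the random test data are assumptions made here. It shows that both models give the same optimal value while the dual has only $n$ inequality rows, with the $T$ scenario constraints moved into variable bounds.

```python
import numpy as np
from scipy.optimize import linprog

def cvar_primal(R, p, beta):
    """Solve model (2); variables are ordered as (x_1..x_n, eta, d_1..d_T)."""
    T, n = R.shape
    # linprog minimises, so the objective eta - (1/beta) sum p_t d_t is negated.
    c = np.concatenate([np.zeros(n), [-1.0], p / beta])
    # d_t - eta + sum_j r_jt x_j >= 0  <=>  -sum_j r_jt x_j + eta - d_t <= 0
    A_ub = np.hstack([-R, np.ones((T, 1)), -np.eye(T)])
    A_eq = np.concatenate([np.ones(n), [0.0], np.zeros(T)])[None, :]
    bounds = [(0, None)] * n + [(None, None)] + [(0, None)] * T
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(T), A_eq=A_eq, b_eq=[1.0],
                  bounds=bounds, method="highs")
    return -res.fun, res.x[:n]

def cvar_dual(R, p, beta):
    """Solve model (3); variables are ordered as (q, u_1..u_T)."""
    T, n = R.shape
    c = np.concatenate([[1.0], np.zeros(T)])                  # minimise q
    # q - sum_t r_jt u_t >= 0  <=>  -q + sum_t r_jt u_t <= 0  (n rows only)
    A_ub = np.hstack([-np.ones((n, 1)), R.T])
    A_eq = np.concatenate([[0.0], np.ones(T)])[None, :]       # sum_t u_t = 1
    # The T scenario constraints appear only as simple upper bounds on u_t.
    bounds = [(None, None)] + [(0.0, pt / beta) for pt in p]
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                  bounds=bounds, method="highs")
    return res.fun

# Illustrative random instance (not the test data of [8]).
rng = np.random.default_rng(0)
T, n, beta = 200, 5, 0.1
R = rng.normal(0.01, 0.05, size=(T, n))
p = np.full(T, 1.0 / T)
primal_value, x_opt = cvar_primal(R, p, beta)
dual_value = cvar_dual(R, p, beta)
```

For equally probable scenarios and $\beta T = 20$, the common optimal value is the average of the 20 worst scenario returns of the optimal portfolio, which gives a simple sanity check of both formulations.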

We have run computational tests on 10 randomly generated test instances developed
by Lim et al. [8]. They were originally generated from a multivariate normal
distribution for 50 or 100 securities with 50,000 scenarios, a number just
providing an adequate approximation to the underlying unknown continuous price
distribution. Scenarios were generated using the Triangular Factorization Method [24]
as recommended in [3]. All computations were performed on a PC with a Pentium 4
2.6 GHz processor and 1 GB RAM employing the simplex code of the CPLEX 9.1
package. An attempt to solve the primal model (2) with 50 securities resulted in 2600
seconds of computation (much more than reported in [8]). On the other hand, the
dual models (3) were solved in 14.3-27.7 CPU seconds on average, depending on the
tolerance level (see Table 1). For 100 securities the optimisation times were longer
but still about 1 minute.


Table 1. Computational times (in seconds) for the dual CVaR model (averages of 10 instances
with 50,000 scenarios)

Number of securities   β = 0.05   β = 0.1   β = 0.2   β = 0.3   β = 0.4   β = 0.5
n = 50                   14.3       18.7      23.6      26.4      27.4      27.7
n = 100                  38.1       52.1      67.9      74.8      76.7      76.0

The SSD consistent [14] and coherent [2] MAD model with complementary risk
measure ($\mu_\delta(x) = E\{\min\{\mu(x), R_x\}\}$) leads to the following LP problem [18]:


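$$
\begin{aligned}
\text{maximise} \quad & \sum_{j=1}^{n} \mu_j x_j - \sum_{t=1}^{T} p_t d_t \\
\text{s.t.} \quad & \sum_{j=1}^{n} x_j = 1, \quad x_j \ge 0 \quad \text{for } j = 1, \ldots, n, \\
& d_t - \sum_{j=1}^{n} (\mu_j - r_{jt}) x_j \ge 0, \quad d_t \ge 0 \quad \text{for } t = 1, \ldots, T.
\end{aligned}
\tag{4}
$$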


The above LP formulation uses $T + n$ variables and $T + 1$ constraints, while the LP
dual model then takes the following form:

$$
\begin{aligned}
\text{minimise} \quad & q \\
\text{s.t.} \quad & q + \sum_{t=1}^{T} (\mu_j - r_{jt}) u_t \ge \mu_j \quad \text{for } j = 1, \ldots, n, \\
& 0 \le u_t \le p_t \quad \text{for } t = 1, \ldots, T,
\end{aligned}
\tag{5}
$$

with dimensionality $n \times (T + 1)$. Hence, high computational
efficiency is guaranteed even for very large numbers of scenarios. Indeed, in the test problems with
50,000 scenarios we were able to solve the dual model (5) in 25.3 seconds on average
for 50 securities and in 77.4 seconds for 100 instruments.

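For a discrete random variable represented by its realisations $y_t$, Gini's mean
difference measure $\Gamma(x) = \sum_{t'=1}^{T} \sum_{t'' \ne t'} \max\{y_{t'} - y_{t''}, 0\}\, p_{t'} p_{t''}$ is LP computable
(when minimised). This leads us to the following GMD portfolio optimisation model
[25]:

$$
\begin{aligned}
\max \quad & - \sum_{t=1}^{T} \sum_{t' \ne t} p_t p_{t'} d_{tt'} \\
\text{s.t.} \quad & \sum_{j=1}^{n} x_j = 1, \quad x_j \ge 0 \quad \text{for } j = 1, \ldots, n, \\
& d_{tt'} \ge \sum_{j=1}^{n} r_{jt} x_j - \sum_{j=1}^{n} r_{jt'} x_j, \quad d_{tt'} \ge 0 \quad \text{for } t, t' = 1, \ldots, T;\ t \ne t',
\end{aligned}
\tag{6}
$$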


which contains $T(T - 1)$ nonnegative variables $d_{tt'}$ and $T(T - 1)$ inequalities to
define them. This generates a huge LP problem even for the historical data case
where the number of scenarios is 100 or 200. Actually, as shown in earlier
experiments [7], the CPU time of 7 seconds on average for $T = 52$ increased to
above 30 s for $T = 104$ and to more than 180 s for $T = 156$. However, similar to
the CVaR models, variables $d_{tt'}$ are associated with singleton coefficient columns.
Hence, when solving the dual instead of the original primal, the corresponding dual
constraints take the form of simple upper bounds (SUB), which are handled implicitly
outside the LP matrix. For the simplest form of the feasible set (1) the dual GMD
model takes the following form:

$$
\begin{aligned}
\min \quad & v \\
\text{s.t.} \quad & v - \sum_{t=1}^{T} \sum_{t' \ne t} (r_{jt} - r_{jt'}) u_{tt'} \ge 0 \quad \text{for } j = 1, \ldots, n, \\
& 0 \le u_{tt'} \le p_t p_{t'} \quad \text{for } t, t' = 1, \ldots, T;\ t \ne t',
\end{aligned}
\tag{7}
$$


where the original portfolio variables $x_j$ are dual prices to the inequalities. The dual
model contains $T(T - 1)$ variables $u_{tt'}$, but the number of constraints (excluding the
SUB structure), $n + 1$, is proportional to the number of securities. The above dual
formulation can be further simplified by introducing variables

$$\bar{u}_{tt'} = u_{tt'} - u_{t't} \quad \text{for } t, t' = 1, \ldots, T;\ t < t', \tag{8}$$

which allows us to reduce the number of variables to $T(T - 1)/2$ by replacing (7)
with the following:

$$
\begin{aligned}
\min \quad & v \\
\text{s.t.} \quad & v - \sum_{t=1}^{T} \sum_{t' > t} (r_{jt} - r_{jt'}) \bar{u}_{tt'} \ge 0 \quad \text{for } j = 1, \ldots, n, \\
& -p_t p_{t'} \le \bar{u}_{tt'} \le p_t p_{t'} \quad \text{for } t, t' = 1, \ldots, T;\ t < t'.
\end{aligned}
\tag{9}
$$


Such a dual approach may dramatically improve the LP model efficiency in the case
of a larger number of scenarios. Indeed, as shown in earlier experiments
[7], the above dual formulations let us reduce the optimisation time to below 10
seconds for $T = 104$ and $T = 156$. Nevertheless, the case of really large numbers
of scenarios may still cause computational difficulties, due to the huge number of
variables ($T(T - 1)/2$). This may require some column generation techniques [4] or
nondifferentiable optimisation algorithms [8].
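A small numerical sketch may help to see how the reduced dual (9) recovers the GMD-optimal portfolio. The code below is an illustration under stated assumptions: it uses SciPy's `linprog` (HiGHS backend, SciPy 1.7 or later for the `ineqlin.marginals` field) instead of CPLEX, the function names and the random data are hypothetical, and the portfolio weights are read off as the negated marginals of the $n$ inequality rows, following the remark that the $x_j$ are dual prices to these inequalities.

```python
import numpy as np
from itertools import combinations
from scipy.optimize import linprog

def gini_mean_difference(y, p):
    """Gamma = sum over ordered pairs t' != t'' of max(y_t' - y_t'', 0) p_t' p_t''."""
    diff = y[:, None] - y[None, :]
    return float(np.sum(np.maximum(diff, 0.0) * np.outer(p, p)))

def gmd_reduced_dual(R, p):
    """Solve model (9); variables are (v, u_bar_tt' for all pairs t < t')."""
    T, n = R.shape
    pairs = list(combinations(range(T), 2))
    # Column of pair (t, t') holds r_jt - r_jt' for every security j.
    D = np.array([R[t] - R[s] for t, s in pairs]).T
    c = np.concatenate([[1.0], np.zeros(len(pairs))])          # minimise v
    # v - sum (r_jt - r_jt') u_bar >= 0  <=>  -v + D u_bar <= 0  (n rows)
    A_ub = np.hstack([-np.ones((n, 1)), D])
    bounds = [(None, None)] + [(-p[t] * p[s], p[t] * p[s]) for t, s in pairs]
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), bounds=bounds, method="highs")
    x = -res.ineqlin.marginals      # portfolio weights as dual prices of the rows
    return res.fun, x

# Illustrative random instance with T = 8 scenarios and n = 3 securities.
rng = np.random.default_rng(1)
T, n = 8, 3
R = rng.normal(0.01, 0.05, size=(T, n))
p = np.full(T, 1.0 / T)
v, x = gmd_reduced_dual(R, p)
gamma_at_x = gini_mean_difference(R @ x, p)
```

In this sketch the optimal value `v` equals minus the smallest attainable Gini's mean difference, so `gamma_at_x` should agree with `-v` up to solver tolerance, and the LP has only $n$ rows and $T(T-1)/2 + 1$ columns.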


3 Conclusions 


The classical Markowitz model uses the variance as the risk measure, thus resulting in 
a quadratic optimisation problem. Several alternative risk measures were introduced, 
which are computationally attractive as (for discrete random variables) they result 
in solving linear programming (LP) problems. The LP solvability is very important 
for applications to real-life financial decisions where the constructed portfolios have 
to meet numerous side constraints and take into account transaction costs [10]. The 
corresponding portfolio optimisation models can be solved with general purpose LP 
solvers, like ILOG CPLEX providing a set of C++ and Java class libraries allowing 
the programmer to embed CPLEX optimisers in C++ or Java applications.

Unfortunately, when more advanced simulation models are employed for scenario
generation, one may get several thousand scenarios. This may lead to an LP
model with a huge number of variables and constraints, thus decreasing the computational
efficiency of the model. We have shown that the computational efficiency can
then be dramatically improved with an alternative model taking advantage of the LP 
duality. In the introduced model the number of structural constraints (matrix rows)
is proportional to the number of instruments, so the number of scenarios does not
seriously affect the efficiency of the simplex method. For the case of 50,000 scenarios, it has
resulted in computation times below 30 seconds for 50 securities or below a minute 
for 100 instruments. Similar computational times have also been achieved for the dual 
reformulation of the MAD model. Dual reformulation applied to the GMD portfolio 
optimisation model results in a dramatic problem size reduction with the number of 
constraints equal to the number of instruments instead of the square of the number of 
scenarios. However, the remaining high number of variables (square of the number of
scenarios) still calls for further research on column-generation techniques
or nondifferentiable optimisation algorithms for the GMD model.


Acknowledgement. The authors are indebted to Professor Churlzu Lim from the University of


A pattern recognition algorithm for optimal profits in
currency trading


Danilo Pelusi 


Abstract. A key issue in technical analysis is to obtain good and possibly stable profits. 
Various trading rules for financial markets do exist for this task. This paper describes a pattern 
recognition algorithm to optimally match training and trading periods for technical analysis 
rules. Among the filter techniques, we use the Dual Moving Average Crossover (DMAC) rule. 
This technique is applied to hourly observations of Euro-Dollar exchange rates. The matching 
method is accomplished using ten chart patterns very popular in technical analysis. Moreover,
to give the results statistical meaning, we use the bootstrap technique. The results
show that the algorithm proposed is a good starting point to obtain positive and stable profits.


Key words: training sets, trading sets, technical analysis, recognition algorithm 


1 Introduction 


The choice of the best trading rules for optimal profits is one of the main problems in 
the use of technical analysis to buy financial instruments. Park and Irwin [31] described 
various types of filter rules, for instance the Dual Moving Average Crossover family, 
the Momentum group of rules and the Oscillators. For each of these filter rules we 
need to find the rule that assures the highest profit. Some good technical protocols, 
to get optimal profits in the foreign exchange market, have been found by Pelusi et 
al. [32]. 

Traders attribute to some chart patterns the property of assessing market conditions
(in any financial market) and anticipating turning points. This kind of analysis
started with the well-known paper of Lo et al. [23], which produced an important stream of literature.
However, the popularity of this kind of analysis has been frequently challenged by
mainstream financial economists [7, 9, 22, 28-30, 35].

Generally, the success of a rule in actual trading is independent of the type of
filter used. It depends on the choice of a so-called "training set", where the maximum
profit parameters of a rule are found, and of an independent "trading set", where the
optimised filter found in the training phase is applied. In other words, a rule which
gives good profits in one period could cause some losses in a different period. This
is due to substantial differences in the shapes of the asset price in the two periods. So, the
main issue for the successful application of technical trading rules is to find the best
association of "training sets" (TN-S) and "trading sets" (TD-S), for the highest
and possibly most stable profit streams.

In this paper, we propose a synthesis of the two traditional approaches in technical 
analysis, outlined above, and use the chart pattern recognition technique for the best 
association of training and trading phases. Some works [7,9,22,23,29] contain studies 
on the information content of chart patterns. 

Our target is to investigate the existence of non-linear configurations in the hourly 
observations of the Euro-Dollar (EUR-USD), Dollar-Yen (USD-JPY) and Pound- 
Dollar (GBP-USD) exchange rates. Our pattern recognition algorithm takes into ac- 
count ten chart patterns which are traditionally analysed in the literature [7,23, 28]. 

In Section 2 we describe the algorithm. The algorithm results are shown in Section 
3, whereas Section 4 contains the conclusions. 


2 Pattern recognition algorithm 


As outlined above, we consider hourly exchange rates. The first task in the construction 
of our algorithm is the recognition that some exchange rate movements are significant 
and others are not. The most significant movements of exchange rates generate a 
specific pattern. Typically, in order to identify regularities and patterns in the time 
series of asset prices, it is necessary to extract non-linear patterns from noisy data. 
This signal extraction can be performed by the human eye; in our algorithm, however,
we use a suitable smoothing estimator. To spot the technical patterns in the
best way, we use kernel regression. Härdle [16] describes this smoothing method,
which permits easier analysis of the curve that describes the exchange rate.

Generally, the various chart patterns are quite difficult to quantify analytically 
(see the technical analysis manuals [2,22,28]). However, to identify a formal way of 
detecting the appearance of a technical pattern, we have chosen the definitions shown 
in the paper of Lo et al. [23]. In these definitions, the technical patterns depend on 
extrema, which must respect certain properties. The use of kernel regression permits 
easy detection of these extrema because the curve that describes the exchange rate is 
smoothed. To identify these extrema we use a suitable method described by Omrane 
and Van Oppens [28]. 
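The smoothing and extrema-detection steps can be sketched as follows. This is a minimal illustration, not the procedure of [16] or [28]: it assumes a Nadaraya-Watson estimator with a Gaussian kernel on the time index, a hand-picked bandwidth, and a simple sign-change rule for locating extrema; the function names and the synthetic series are hypothetical.

```python
import numpy as np

def kernel_smooth(prices, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel on time indices 0..N-1."""
    prices = np.asarray(prices, dtype=float)
    t = np.arange(len(prices))
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / bandwidth) ** 2)
    return (w @ prices) / w.sum(axis=1)

def local_extrema(smooth):
    """Indices where the slope of the smoothed curve changes sign."""
    slope = np.sign(np.diff(smooth))
    change = np.diff(slope)
    maxima = np.where(change < 0)[0] + 1   # slope goes from + to -
    minima = np.where(change > 0)[0] + 1   # slope goes from - to +
    return maxima, minima

# Synthetic "exchange rate": a slow cycle plus a small high-frequency jitter.
t = np.arange(200)
prices = 1.3 + 0.01 * np.sin(2 * np.pi * t / 100) + 0.0002 * (-1) ** t
smooth = kernel_smooth(prices, bandwidth=5.0)
maxima, minima = local_extrema(smooth)
```

On the raw series almost every observation is a local extremum because of the jitter, whereas the smoothed curve keeps only the turning points of the slow cycle. A larger bandwidth removes more extrema, which plays a role comparable to the cutoff value discussed next.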

To detect the presence of technical patterns in the best way, we use a cutoff value
as in the work of Osler and Chang [30]. The number of maxima
and minima identified in the data is inversely related to the value of the cutoff. In
other words, an increase or decrease of the cutoff generates a different series of maxima and
minima, which will result in a different set of chart patterns. For each cutoff value, the
algorithm searches for the chart patterns HS, IHS, BTOP, BBOT, TTOP, TBOT, RTOP, RBOT,
DTOP and DBOT on the basis of their definitions [23]. Considering a single pattern at a
time, the algorithm counts the number of patterns of that type for each cutoff value.

To establish a similarity parameter, we define, for each $j$th technical pattern
($j = 1, 2, \ldots, 10$), the coefficient that represents the similarity degree between
two different periods. Therefore, our algorithm takes the pattern number for each $i$th
cutoff value and computes

$$d_{i,j} = \left| n^{(1)}_{i,j} - n^{(2)}_{i,j} \right|, \qquad i = 1, 2, \ldots, 18, \tag{1}$$

where $d_{i,j}$ is the absolute value of the difference between the number of $j$-type chart
patterns in period 1, $n^{(1)}_{i,j}$, and that in period 2, $n^{(2)}_{i,j}$, for each cutoff value. So, we are able to
define the similarity coefficient $S_j$ as

$$S_j = \frac{1}{n_{c_j}} \sum_{i=1}^{n_{c_j}} \frac{1}{2^{d_{i,j}}}, \qquad n_{c_j} \ge 1. \tag{2}$$

The similarity coefficient assumes values that lie between 0 and 1, and $n_{c_j}$ is the
number of possible comparisons. At this step, our algorithm gives ten similarity coefficients
connected to the ten technical patterns named above. The next step consists of
computing a single value that gives informational content on the similarity between
the periods. We refer to this value as Global Similarity (GS) and we define it as a
weighted average


$$GS = \sum_{j=1}^{10} w_j S_j. \tag{3}$$

The weights $w_j$ are defined as the ratio between the number of comparisons of the $j$th
pattern and the total number of comparisons $n_t$ over all patterns (see formula 4):

$$w_j = \frac{n_{c_j}}{n_t}, \qquad n_t = n_{c_1} + n_{c_2} + \ldots + n_{c_{10}}. \tag{4}$$

Moreover, the sum of the weights $w_j$, with $j$ from 1 to 10, is equal to 1 (see formula 5):

$$\sum_{j=1}^{10} w_j = 1. \tag{5}$$

Computing the global similarity GS through (3), we assign more weight to the
similarity coefficients with a greater number of comparisons.
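The weighting scheme of formulas (3)-(5) is easy to reproduce. The sketch below is illustrative only: the function name `global_similarity` and the input values are made up, and three patterns are used instead of ten to keep the example short. The similarity coefficients are taken as given inputs.

```python
import numpy as np

def global_similarity(S, n_c):
    """Weighted average of similarity coefficients, formulas (3) and (4).

    S[j]   : similarity coefficient of pattern j (between 0 and 1)
    n_c[j] : number of comparisons available for pattern j
    """
    S = np.asarray(S, dtype=float)
    n_c = np.asarray(n_c, dtype=float)
    w = n_c / n_c.sum()          # w_j = n_cj / n_t, so the weights sum to 1
    return float(w @ S), w

# Hypothetical values for three patterns.
S = [0.8, 0.5, 1.0]
n_c = [10, 30, 60]
gs, weights = global_similarity(S, n_c)
```

Here the weights are 0.1, 0.3 and 0.6, so the pattern with 60 comparisons dominates and GS is 0.83.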

The next step is related to the choice of the length of the time periods for the training and
trading phases. For the trading set, we consider the time series of a certain year.
We take the exchange rates from the end of the time series back to
six months earlier. In this manner, we obtain a semester in the year
considered. Subsequently, we create the second semester, starting one month before the end of the
year and again going back six months. Thus, we obtain a certain number
of semesters. We then compare these trading semesters with various semesters
of the previous years. Subsequently, a selection of pairs of training and trading semesters
is accomplished, splitting the pairs with positive slopes from those with negative slopes.
So, we compute the profits¹ of the trading semesters by considering the optimised
parameters (see [32]) of the corresponding training semesters. For each semester pair,
we compute the GS coefficient through formula (3) and define the following quantity:

$$f_{GS} = \frac{n_{GS}}{n_p}, \tag{6}$$

where $n_p$ is the number of semester pairs with positive profit and $n_{GS}$ is the number
of profitable semesters with similarity index GS that lies between two extrema. In
particular, we consider the GS ranges 0.0-0.1, 0.1-0.2, ..., 0.9-1.0. So, the
quantity $f_{GS}$ represents a measure of the frequency of global similarity values based
on their membership of the ranges named above.

¹ To compute the profits, we use the DMAC filter rule [32].
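Formula (6) amounts to a histogram of the GS values of the profitable semester pairs, normalised by their number. A minimal sketch follows; the function name and the sample values are assumptions for illustration, and values falling exactly on a range boundary are assigned by NumPy to the upper range (except 1.0, which stays in the last one).

```python
import numpy as np

def gs_frequencies(gs_values, profits):
    """f_GS for the ten ranges 0.0-0.1, ..., 0.9-1.0, as in formula (6)."""
    gs = np.asarray(gs_values, dtype=float)
    profits = np.asarray(profits, dtype=float)
    profitable = gs[profits > 0]               # pairs with positive profit
    if len(profitable) == 0:
        raise ValueError("no semester pair with positive profit")
    edges = np.linspace(0.0, 1.0, 11)
    n_gs, _ = np.histogram(profitable, bins=edges)
    return n_gs / len(profitable)              # n_GS / n_p for each range

# Hypothetical GS values and profits for six semester pairs.
gs_values = [0.05, 0.15, 0.12, 0.33, 0.18, 0.07]
profits = [0.10, 0.20, -0.10, 0.05, 0.30, 0.02]
f_gs = gs_frequencies(gs_values, profits)
```

In this example five pairs are profitable, two of them in each of the first two ranges and one in the range 0.3-0.4, so the frequencies are 0.4, 0.4, 0 and 0.2, followed by zeros.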

For the algorithm results to be statistically meaningful, we need to apply our
technique to many samples that have a trend similar to the exchange rates of the year
considered. To solve this problem, we use a technique described by Efron [12], called
the bootstrap method [17, 34]. The key idea is to resample from the original data,
either directly or via a fitted model,² to create various replicate data sets, from which
the variability of the quantities of interest can be assessed without long-winded and
error-prone analytical calculations. In this way, we create some artificial exchange
rate series, each of which is of the same length as the original series.


3 Experimental results 


We apply our algorithm to hourly Euro-Dollar exchange rates and consider 2006 as 
the year for trading. To create samples with trends similar to the Euro-Dollar exchange 
rate 2006, we use parametric bootstrap methods [3]. 

In the parametric bootstrap setting, we consider an unknown distribution $F$ to
be a member of some prescribed parametric family and obtain a discrete empirical
distribution $F_t^*$ by estimating the family parameters from the data. By generating
an iid random sequence from the distribution $F_t^*$, we can arrive at new estimates of
various parameters of the original distribution $F$.

The parametric methods used are based on assuming a specific model for the data.
After estimating the model by a consistent method, the residuals are bootstrapped. In
this way, we obtain samples of the same length as the 2006 exchange rate series.
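The residual bootstrap described above can be sketched in a few lines. Note the simplification: the paper fits a GARCH model, whereas this illustration swaps in a plain AR(1) model on log-returns fitted by least squares, purely to keep the example self-contained. The function name, the model and the synthetic series are therefore assumptions, not the authors' setup.

```python
import numpy as np

def residual_bootstrap(series, n_samples, seed=0):
    """Generate artificial series of the same length as the original one.

    Steps: fit r_t = c + phi * r_{t-1} + e_t to the log-returns by least
    squares, resample the fitted residuals with replacement, and rebuild
    a price path from the simulated returns.
    """
    series = np.asarray(series, dtype=float)
    rng = np.random.default_rng(seed)
    r = np.diff(np.log(series))                      # log-returns
    x, y = r[:-1], r[1:]
    xc = x - x.mean()
    phi = np.dot(xc, y - y.mean()) / np.dot(xc, xc)  # AR(1) slope
    c = y.mean() - phi * x.mean()                    # AR(1) intercept
    resid = y - (c + phi * x)                        # fitted residuals
    out = np.empty((n_samples, len(series)))
    for b in range(n_samples):
        e = rng.choice(resid, size=len(y), replace=True)
        sim = np.empty(len(r))
        sim[0] = r[0]
        for t in range(1, len(r)):
            sim[t] = c + phi * sim[t - 1] + e[t - 1]
        out[b] = series[0] * np.exp(np.concatenate([[0.0], np.cumsum(sim)]))
    return out

# Synthetic hourly-like series standing in for the exchange rate.
rng = np.random.default_rng(42)
series = 1.2 * np.exp(np.cumsum(rng.normal(0.0, 0.001, size=500)))
samples = residual_bootstrap(series, n_samples=5)
```

Each row of `samples` is one artificial series that starts at the same level as the original and has the same length, which is the property required for the semester comparisons.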

Table 1 shows the results with 10, 100, 200, 300, 400 and 500 samples. The
rows correspond to the number of samples and the columns to the global similarity
frequency defined in formula (6). We can note that there are no results for the ranges
0.5-0.6, 0.6-0.7, ..., 0.9-1.0 because they give a null contribution, that is, there are
no global similarity values belonging to those ranges. Moreover, we can
observe that for 10 samples, the range with the highest frequency is 0.0-0.1, that is, it
is more likely that with a similarity coefficient between 0 and 0.1 we have a positive
profit than for the other ranges.

The statistical results for 100 samples show that the range with the greatest frequency
is 0.1-0.2. For 200, 300, 400 and 500 samples we obtain about the same value.


² We use a GARCH model (see [4]).

Table 1. Pattern recognition algorithm results

Samples   f_GS1    f_GS2    f_GS3    f_GS4    f_GS5
10        0.4500   0.4440   0.0920   0.0150   0
100       0.3295   0.4859   0.1633   0.0206   0.0006
200       0.3386   0.4722   0.1671   0.0214   0.0007
300       0.3242   0.4770   0.1778   0.0206   0.0005
400       0.3206   0.4833   0.1774   0.0183   0.0003
500       0.3228   0.4813   0.1768   0.0188   0.0003


Therefore, we can infer that from 100 to 500 samples the results remain stable.³ It is most
likely that with a number of samples greater than 500, the distribution will be wider.

We also report the results of applying the algorithm to real-world data. To
do this, we consider the Euro-Dollar exchange rates of the year 2007. In particular, we
choose three semester pairs that could be defined as "similar" by the human eye. We
first consider the second semester of 2003 as the training semester and the second semester
of 2007 as the trading semester for Euro-Dollar exchange rates. These semesters are
shown in Figure 1.

All the figures contain two graphs: the first one shows the half-yearly trend of the
Euro-Dollar exchange rate, whereas the second one is a double-scale graph. The
use of a double scale is necessary to study the relationship between the exchange rate and
the profit obtained by applying the technical filter. The trading rule described in this practical
example is a long-short strategy. Moreover, the technical filter used is the DMAC
rule, which is based on the moving average definition and on Take Profit (TP) and
Stop Loss (SL) parameters.
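The core of a dual moving average crossover rule can be sketched as follows. This is a deliberately simplified illustration: it covers only the crossover signal of a long-short strategy and omits the Take Profit and Stop Loss parameters used in the paper. The function names, the window lengths and the synthetic price path are assumptions.

```python
import numpy as np

def moving_average(x, window):
    """Trailing mean over `window` observations; the first window-1 entries are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    c = np.cumsum(np.insert(x, 0, 0.0))
    out[window - 1:] = (c[window:] - c[:-window]) / window
    return out

def dmac_positions(prices, fast, slow):
    """+1 (long) when the fast MA is above the slow MA, -1 (short) when below,
    0 while the slow MA is not yet defined."""
    f = moving_average(prices, fast)
    s = moving_average(prices, slow)
    pos = np.zeros(len(f))
    valid = ~np.isnan(s)
    pos[valid] = np.sign(f[valid] - s[valid])
    return pos

def strategy_return(prices, positions):
    """Sum of simple returns, where the position held at t earns the move t -> t+1."""
    prices = np.asarray(prices, dtype=float)
    rets = np.diff(prices) / prices[:-1]
    return float(np.sum(positions[:-1] * rets))

# Synthetic path: a steady rise followed by a steady fall.
prices = np.concatenate([np.linspace(1.0, 2.0, 50), np.linspace(2.0, 1.0, 50)])
positions = dmac_positions(prices, fast=3, slow=10)
profit = strategy_return(prices, positions)
```

On this path the rule is long during the rise and turns short a few observations after the peak, so the cumulative return is positive. In a training and trading setup, `fast` and `slow` would be the parameters optimised on the training set and then applied unchanged to the trading set.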

The double-scale graphs have time on the x-axis, the shape of the Euro-Dollar 
exchange rate on the left y-axis and profit on the right y-axis. We underline that the 
y-axis values are pure numbers, that is without units of measurement. 

The results of Table 2 show that there is a loss of about 13% with a GS coefficient
of 0.61844. From Figure 1 we can note that there are substantial shape differences at
the beginning and at the end of the semesters and that the profit has a decreasing trend
for more than half a semester.

Figure 2 shows the second semester of 2004 (training semester) and the second
semester of 2007 (trading semester). As can be seen in Table 2, we obtain a profit
of about 26% with a global similarity index of 0.66299. Observing Figure 2, we can
see that the profit is essentially growing, except at the beginning and at the end of the
trading semester.

We choose as the third pair the first semester of 2006 and the second semester of
2007 (Fig. 3). In this case, we have a loss of about 14% and a GS of 0.61634. We deduce that
the loss is probably due to the substantial shape differences in the various semester
sub-periods (see Fig. 3), as happened in the case of Figure 1. From Figure 3 we
note that there are considerable losses at the beginning and at the end of the trading
semester.

³ To perform the calculations, our algorithm needs about three hours of computer time for each
sample. However, in the future we will consider sample numbers greater than 500.

Fig. 1. Trading phase using the second semester of 2003 as the training period

Table 2 summarises the results of applying the algorithm to the chosen semester pairs.
From the observation of these results, we infer that there is a global similarity threshold
which lies between 0.61844 and 0.66299. For global similarity values greater than
this threshold we should obtain profits.


4 Conclusions and future work 


In the technical analysis literature, some authors attribute to chart patterns the property
of assessing market conditions and anticipating turning points. Some works develop
and analyse the information content of chart patterns. Other papers have shown the
importance of choosing the best trading rules for maximum and stable profits. Therefore,
there are some technical filters that assure the highest profit. In this context, an
important issue is the choice of the training and trading sets.

Table 2. Profitability and similarity results of the semester pairs

Training semester   Trading semester   Profit     GS
2nd 2003            2nd 2007           -0.1301    0.61844
2nd 2004            2nd 2007            0.2592    0.66299
1st 2006            2nd 2007           -0.1367    0.61634

Fig. 2. Trading phase using as training period the second semester of 2004

Fig. 3. Trading phase using as training period the first semester of 2006

In this paper, we describe a pattern recognition algorithm to optimally match 
a training period and a trading period in the DMAC filter rule. The trading rule 
described is a long-short strategy investing in foreign exchange rates. We illustrate
a practical example choosing the semester as the testing period and obtaining stable
results. This stability is also verified for different periods, such as monthly, yearly
and two-yearly periods. Moreover, for these temporal ranges, we compute statistics on
the short and long operations separately. In particular, we compute the mean and the
standard error of the number of operations, obtaining some interesting information. It
might be convenient to also report standard indicators such as performance, volatility
and Sharpe ratio, typical of the finance industry.

The aim of this work is to obtain positive profits in accordance with similarity
degrees between training and trading periods. Our method gives a similarity index
that can be useful to establish whether a training set contains valuable information for a future
trading set. The results show that the similarity index is a good starting point for this
kind of study. Therefore, we will need to analyse how differences in shape have an
impact on profits for global similarity indexes of comparable magnitude.


Nonlinear cointegration in financial time series


Claudio Pizzi 


Abstract. In this paper, the concept of linear cointegration as introduced by Engle and 
Granger [5] is merged into the local paradigm. Adopting a local approach enables the achieve- 
ment of a local error correction model characterised by dynamic parameters. Another important 
result obtained using the local paradigm is that the mechanism that leads the dynamic system 
back to a steady state is no longer a constant: it is a function not defined a priori but estimated 
point by point. 


Key words: nonlinearity, cointegration, local polynomial model 


1 Introduction 


One of the aims of the statistical analysis of a time series is to enable the researcher 
to build a simplified representation of the data-generating process (DGP) and/or the 
relationship amongst the different phenomena under study. The methods for identi- 
fying and estimating these models are based on the assumption of the stationarity of 
the DGP. Nevertheless, this assumption is often violated when considering financial 
phenomena, for example stock price, interest rates, exchange rates and so on. The 
financial time series usually present a non-stationarity of the first order if not higher. 

In the case of the construction of a regressive model, the presence of unit roots in 
the time series means attention should be paid to the possible cointegration amongst 
the variables. 

The cointegration idea, which characterises the long-run relationship between 
two (or several) time series, can be represented by estimating a vector of parameters 
and can be used to build a dynamic model that enables both long-run relationships 
and also some transitional short-run information to be highlighted. This enables the
representation of an error correction model that can be considered as a dynamic system
characterised by the fact that any departure from the steady state generates a short-run
dynamic.

The linear cointegration concept introduced by Engle and Granger [5] has been
broadly debated in the literature and much has been published on this topic. The
researchers' interest has mainly been focused on the problem of estimating the cointegration
relationship (it is worth mentioning, amongst others, Johansen [8], Saikkonen [18],
Stock and Watson [21], Johansen [9] and Strachan and Inder [22]) and
building statistical tests to verify the presence of such a relationship.

The first test suggested by Engle and Granger [5] was followed by the tests pro- 
posed by Stock and Watson [20] to identify common trends in the time series assessed. 
After them, Phillips and Ouliaris [15] developed a test based on the principal com- 
ponents method, followed by a test on regression model residuals [14]. Johansen [8] 
instead proposed a test based on the likelihood ratio. 

The idea of linear cointegration has then been extended to consider some kind of
nonlinearity. Several research strands can be identified against this background. One
suggests that the response mechanism to the departure from the steady state follows a
threshold autoregressive process (see for example the work by Balke and Fomby [1]).
With regard to the statistical tests to assess the presence of threshold cointegration,
see for example Hansen and Byeongseon [7].

The second strand considers fractional cointegration: amongst the numerous
contributions, we would like to recall Cheung and Lai [3], Robinson and Marin- 
ucci [16], Robinson and Hualde [17] and Caporale and Gil-Alana [2]. 

Finally, Granger and Yoon [6] introduced the concept of hidden cointegration that
envisages an asymmetric system response, i.e., the mechanism that guides the system
to the steady state is only active in the presence of either positive or negative shocks, 
but not of both. Schorderet’s [19] work follows up this idea and suggests a procedure 
to verify the presence of hidden cointegration. 

From a more general standpoint, Park and Phillips [12] considered non-linear 
regression with integrated processes, while Lee et al. [11] highlighted the existence of 
a spurious nonlinear relationship. In the meantime, further developments contemplated
an equilibrium adjustment mechanism guided by a non-observable weak force
(see Pellizzari et al. [13]).

This work is part of the latter research strand and proposes the use of local
linear models (LLM) to build a test for nonlinear cointegration. Indeed, the use of
local models has the advantage of not requiring the a priori definition of the func- 
tional form of the cointegration relationship, enabling the construction of a dynamic 
adjustment mechanism. In other words, a different model can be considered for each 
instant (in the simplest of cases, it is linear) to guide the system towards a new equi- 
librium. The residuals of the local model can thus be employed to define a nonlinear 
cointegration test. The use of local linear models also enables the construction of a 
Local Error Correction Model (LECM) that considers a correction mechanism that 
changes in time. The paper is organised as follows. The next section introduces the 
idea of nonlinear cointegration, presenting the LECM and the unrestricted Local Er- 
ror Correction Model (uLECM). Section 3 presents an application to real data, to 
test the nonlinear cointegration assumption. The time series pairs for which the null 
hypothesis of no cointegration is rejected will be used to estimate both the LECM 
and the speed of convergence to equilibrium. The paper will end with some closing 
remarks. 


2 Nonlinear cointegration 


Let $X_t, Y_t$, $t = 1, \ldots, T$, be the realisations of two processes integrated of order $d$, and consider the following relationship: 


$$Y_t = \beta_0 + \beta_1 X_t + z_t. \qquad (1)$$


If the vector $\beta = (\beta_0, \beta_1)$ is not null and $z_t$ is an integrated process of order $b < d$, then the variables $X_t$ and $Y_t$ are linearly cointegrated and $\beta$ is the cointegration vector. In what follows, without loss of generality, we take $d = 1$ and consequently $b = 0$. Consider now a general dynamic relationship between $Y$ and $X$: 


$$Y_t = \alpha + \beta X_t + \gamma X_{t-1} + \delta Y_{t-1} + u_t. \qquad (2)$$

The parameter restriction $\beta + \gamma = 1 - \delta$ and some algebra lead to the following error correction model (ECM): 

$$\Delta Y_t = \alpha + \beta \Delta X_t - \pi \hat{z}_{t-1} + v_t, \qquad (3)$$


where $\Delta Y_t = Y_t - Y_{t-1}$, $\Delta X_t = X_t - X_{t-1}$, $\hat{z}_{t-1}$ are the residuals of the model estimated by equation (1), $\pi$ is the adjustment coefficient and $v_t$ is an error term that satisfies the standard properties. 

As an alternative, the unrestricted approach can be considered. The unrestricted 
error correction model can be specified as: 


$$\Delta Y_t = \alpha^* + \beta^* \Delta X_t + \pi_1 Y_{t-1} + \pi_2 X_{t-1} + v_t. \qquad (4)$$


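In a steady state there are no variations, thus $\Delta Y_t = \Delta X_t = 0$, so that, denoting by $Y^*$ and $X^*$ the variables of the long-run relationship, the long-run solution can be written as: 


$$0 = \alpha^* + \pi_1 Y^* + \pi_2 X^* \qquad (5)$$

and 

$$Y^* = -\frac{\alpha^*}{\pi_1} - \frac{\pi_2}{\pi_1} X^*. \qquad (6)$$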
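
As a minimal sketch of how the long-run solution follows from the unrestricted model, the snippet below solves the steady-state condition $0 = \alpha^* + \pi_1 Y^* + \pi_2 X^*$ for $Y^*$. The function name `long_run_solution` and the numerical coefficients are illustrative assumptions, not estimates from this paper.

```python
def long_run_solution(alpha_star, pi1, pi2):
    """Return (intercept, slope) of Y* = intercept + slope * X*.

    Obtained by setting Delta Y = Delta X = 0 in the unrestricted ECM,
    i.e. 0 = alpha* + pi1 * Y* + pi2 * X*, and solving for Y*.
    """
    if pi1 == 0:
        raise ValueError("pi1 must be non-zero: no adjustment, no long-run solution")
    return -alpha_star / pi1, -pi2 / pi1


# Hypothetical coefficients, chosen only for illustration.
alpha_star, pi1, pi2 = 0.2, -0.4, 0.3
intercept, slope = long_run_solution(alpha_star, pi1, pi2)

x_star = 2.0
y_star = intercept + slope * x_star
# The steady-state condition should hold up to rounding error.
residual = alpha_star + pi1 * y_star + pi2 * x_star
```

With a negative $\pi_1$, as in this made-up example, deviations of $Y$ above its long-run level are followed by downward corrections.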


The long-run relationship is estimated by $\pi_2/\pi_1$, whereas $\pi_1$ is an estimate of the speed of adjustment. The cointegration relationship is interpreted as a mechanism that arises whenever there is a departure from the steady state, engendering a new equilibrium. Both (3) and (6) highlight that the relationship between the variables is linear in nature and that all the parameters are constant with respect to time. While this is a convenient simplification for understanding how the system works, it restricts the correction mechanism to a linear and constant response. To make the mechanism that leads back to the steady state dynamic, we suggest considering the relationship between the variables of the model described by equation (1) in local terms. In a traditional approach to cointegration, the parameters of (1) are estimated just once by using all available observations and, in the case of homoscedasticity, a constant weight is assigned to each sample point. By contrast, the local approach replaces the model represented by (1) with the following equation: 


$$Y_t = \beta_{0,t} + \beta_{1,t} X_t + z_t. \qquad (7)$$


The estimation of parameters in the local model is achieved through the following 
weighted regression: 


$$\min_{\beta_{0,t},\, \beta_{1,t}} \sum_{i=1}^{T} \left[ Y_i - \beta_{0,t} - \beta_{1,t} X_i \right]^2 w_{t,i}, \qquad (8)$$


where $w_{t,i}$ indicates the weight associated with the $i$th sample point in the estimate of the function at point $t$. The weights measure the similarity between the sample points $X_t$ and $X_i$ and are defined as follows: 


$$w_{t,i} = \odot\, k\left[ (X_{t-j} - X_{i-j}) / h \right], \qquad (9)$$


where $\odot$ is an aggregation operator that sums the similarities between ordered pairs of observations. The function $k$ is continuous, positive and achieves its maximum at zero; it is also known as a kernel function. Amongst the most broadly used kernel functions are the Epanechnikov kernel, with minimum variance, and the Gaussian kernel, which will be used for the application described in the next section. The kernel function in (9) depends on the parameter $h$, called the bandwidth, which works as a smoother: as it increases, the weights $w_{t,i}$ become larger and more similar to one another. The parameter $h$ has another interpretation, namely as a measure of the model’s “local” nature: the smaller $h$ is, the more the estimates are based on a few sample points that are very similar to the current one. Conversely, a higher value of $h$ means that many sample points are used by the model to estimate the parameters. This paper considers local linear models, but other local models are possible, for example those based on nearest neighbours, which use constant weights and a fixed-size subset of the sample observations. Once the local model has been estimated, a nonlinear cointegration test can be built on the model’s residuals, following the two-stage procedure described by Engle and Granger. Furthermore, with an adaptation from a global to a local paradigm similar to the one applied to (1), equation (4) becomes: 


$$\Delta Y_t = \alpha^*_t + \beta^*_t \Delta X_t + \pi_{1,t} Y_{t-1} + \pi_{2,t} X_{t-1} + v_t. \qquad (10)$$


The long-run relationship and the speed of adjustment also depend on time and are no longer constant, as they depend on the parameters $\pi_{1,t}$ and $\pi_{2,t}$, which are estimated locally. 

In the next section both the LECM (equation 7) and the uLECM (equation 10) will be estimated: the former to test the nonlinear cointegration hypothesis and the latter to estimate the speed of adjustment. 
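
The local estimation step can be sketched as a kernel-weighted least squares fit. The code below is only an illustration under simplifying assumptions: the weights compare the current value of $X$ alone (no lagged values and no aggregation operator), the kernel is Gaussian, and the data, the bandwidth and the function names are made up.

```python
import math


def gaussian_kernel(u):
    """Gaussian kernel: continuous, positive, maximal at zero."""
    return math.exp(-0.5 * u * u)


def local_linear_fit(x, y, x0, h):
    """Weighted least squares of y on (1, x), with weights centred at x0.

    Returns (beta0, beta1) minimising sum_i w_i * (y_i - beta0 - beta1*x_i)**2,
    where w_i = gaussian_kernel((x_i - x0) / h).
    """
    w = [gaussian_kernel((xi - x0) / h) for xi in x]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx * swx
    if det == 0:
        raise ValueError("degenerate design: bandwidth too small or x constant")
    beta1 = (sw * swxy - swx * swy) / det
    beta0 = (swy - beta1 * swx) / sw
    return beta0, beta1


# Toy data (made up): the slope changes from 1 to 3 at x = 5.
x = [i * 0.1 for i in range(101)]
y = [xi if xi <= 5 else 5 + 3 * (xi - 5) for xi in x]

b0_left, b1_left = local_linear_fit(x, y, x0=2.0, h=0.3)
b0_right, b1_right = local_linear_fit(x, y, x0=8.0, h=0.3)
```

With a small bandwidth each fit sees essentially one regime, so the two local slopes differ; a very large bandwidth would make all weights nearly equal and return a single global line.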


3 An application to a financial time series 


To verify whether there is cointegration amongst financial variables, the time series of the adjusted closing prices of 15 US stocks were taken from the S&P 500 basket. For each stock the period considered went from 03.01.2007 to 31.12.2007. Table 1 summarises the 15 stocks considered, the branch of industry they refer to and the results of the Phillips-Perron test performed on the price series to assess the presence of unit roots. 


Table 1. p-values of the Phillips-Perron test 


                                                                       p-value
Code  Name                      Industry                               Stationarity  Explosive
AIG   American Internat. Group  Insurance                              0.565         0.435
CR    Crane Company             Machinery                              0.623         0.377
CSCO  Cisco Systems             Communications equipment               0.716         0.284
F     Ford Motor                Automobiles                            0.546         0.454
GM    General Motors            Automobiles                            0.708         0.292
GS    Goldman Sachs Group       Capital markets                        0.536         0.464
JPM   JPMorgan Chase & Co.      Diversified financial services         0.124         0.876
MER   Merrill Lynch             Capital markets                        0.585         0.415
MOT   Motorola Inc.             Communications equipment               0.109         0.891
MS    Morgan Stanley            Investment brokerage                   0.543         0.457
NVDA  NVIDIA Corp.              Semiconductor & semiconductor equip.   0.413         0.587
PKI   PerkinElmer               Health care equipment & supplies       0.655         0.345
TER   Teradyne Inc.             Semiconductor & semiconductor equip.   0.877         0.123
TWX   Time Warner Inc.          Media                                  0.451         0.549
TXN   Texas Instruments         Semiconductor & semiconductor equip.   0.740         0.260


The test was performed to assess the presence of unit roots both against stationarity (fourth column) and against explosiveness (last column). The null hypothesis of a unit root could not be rejected for any of the time series considered. A further test, the KPSS test [10], was carried out to assess the null hypothesis that the time series is level or trend stationary. The results confirmed that all the series are nonstationary. Given the results of the nonstationarity tests, we proceeded to verify the assumption of cointegration in the time series. Acceptance of the latter assumption is especially interesting: it can be interpreted in terms of the mechanisms that affect the quotations of the stocks considered. More specifically, the shocks that perturb the quotations of a stock imply a departure from the system’s steady state, thus inducing variations that depend on the extent of the shock and on the estimable speed of convergence towards the new equilibrium. From another standpoint, the presence or absence of cointegration between two stocks may become important when contemplating the implementation of a trading strategy. Recording the variations in one variable (the quotation of a stock) enables prediction of the “balancing” response of some variables and identifies the purely random responses of others. Considering the definition of cointegration introduced in the previous section and the 15 shares contemplated in this application, there are numerous applicable cointegration tests, as each possible m-tuple of variables can be considered, with m = 2, ..., 15. 
In this paper the test of the cointegration hypothesis has been restricted to the relationships between all the different pairs of time series. The Phillips-Ouliaris test [15] was performed on each pair; the results are summarised in Table 2. The table highlights in bold the p-values lower than 0.05, which identify the pairs of stocks for which the hypothesis of no cointegration was rejected. The test was performed with the R software. Note that the software outputs 0.15 for any p-value greater than 0.15. Hence the values 0.15 in Table 2 mean that the null hypothesis of no cointegration is accepted with a p-value > 0.15. 
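
The size of the testing problem can be checked with a few lines of code. The sketch below only counts combinations of the 15 ticker codes listed in Table 1; it does not run any cointegration test.

```python
from itertools import combinations
from math import comb

tickers = ["AIG", "CR", "CSCO", "F", "GM", "GS", "JPM", "MER",
           "MOT", "MS", "NVDA", "PKI", "TER", "TWX", "TXN"]

# Unordered pairs of stocks, as analysed in this paper.
pairs = list(combinations(tickers, 2))
n_pairs = len(pairs)

# All m-tuples with m = 2, ..., 15, i.e. every subset of at least two stocks.
n_tuples = sum(comb(len(tickers), m) for m in range(2, len(tickers) + 1))
```

The 15 series give 105 pairs, against 32,752 possible m-tuples, which is why restricting the analysis to pairs keeps the number of tests manageable.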


Table 2. p-values of the Phillips-Ouliaris test for the null hypothesis that the time series are not cointegrated 


      AIG   CR    CSCO  F     GM    GS    JPM   MER   MOT   MS    NVDA  PKI   TER   TWX
CR    0.15
CSCO  0.15  0.15
F     0.15  0.15  0.15
GM    0.15  0.15  0.15  0.15
GS    0.15  0.15  0.15  0.15  0.15
JPM   0.15  0.15  0.15  0.15  0.15  0.15
MER   0.15  0.15  0.15  0.15  0.15  0.15  0.15
MOT   0.06  0.15  0.15  0.15  0.15  0.15  0.15  0.07
MS    0.15  0.15  0.15  0.15  0.15  0.15  0.15  0.15  0.15
NVDA  0.01  0.15  0.15  0.15  0.15  0.15  0.15  0.15  0.15  0.05
PKI   0.15  0.15  0.15  0.15  0.15  0.15  0.01  0.06  0.15  0.15  0.15
TER   0.15  0.04  0.15  0.15  0.15  0.15  0.15  0.15  0.15  0.15  0.15  0.15
TWX   0.05  0.15  0.04  0.08  0.15  0.15  0.15  0.15  0.15  0.14  0.07  0.15  0.15
TXN   0.11  0.15  0.15  0.15  0.15  0.15  0.15  0.04  0.01  0.11  0.10  0.10  0.15  0.15


Table 3. p-values of the nonlinear cointegration test for the null hypothesis that the time series are not cointegrated 


      AIG   CR    CSCO  F     GM    GS    JPM   MER   MOT   MS    NVDA  PKI   TER   TWX
CR    0.19
CSCO  0.65  0.00
F     0.93  0.97  0.99
GM    0.95  0.97  0.81  0.59
GS    0.03  0.31  0.06  0.00  0.20
JPM   0.27  0.13  0.98  0.00  0.99  0.00
MER   0.23  0.14  1.00  0.99  0.99  0.98  0.28
MOT   0.08  0.53  0.14  0.00  0.99  0.01  0.02  0.01
MS    0.00  0.00  0.16  0.94  1.00  0.94  0.59  0.86  0.46
NVDA  0.13  0.51  1.00  0.00  1.00  0.14  0.09  0.77  0.89  0.00
PKI   0.02  0.01  0.00  0.00  1.00  0.00  0.00  0.00  0.13  0.89  0.01
TER   0.45  0.01  0.00  0.99  0.99  0.99  0.00  0.05  0.00  0.99  0.00  0.00
TWX   0.42  0.06  1.00  1.00  1.00  0.99  0.00  0.42  0.01  1.00  1.00  0.02  0.00
TXN   0.41  0.00  0.99  1.00  1.00  0.99  0.00  0.79  0.02  1.00  0.01  0.00  0.00  0.01


The figures in bold show that only 7 of the 105 pairs of time series are linearly cointegrated; in the majority of cases there is no long-run relationship and no adjustment mechanism. To assess the presence of nonlinear cointegration, as presented in the previous section, we adapted the two-stage procedure suggested by Engle and Granger to the nonlinear framework. In the first stage, the local linear model was estimated for the pair of series under investigation, and the stationarity of its residuals was then tested. If the null hypothesis of nonstationarity was rejected, the second stage of the procedure was conducted: it consisted in verifying the cointegration hypothesis by performing a second regression as in (3). The results of the two-stage procedure are shown in Table 3. They highlight that the use of local linear models has enabled the identification of nonlinear cointegration relationships in 40 pairs of time series. This supports the initial assumption that time series lacking linear cointegration may in fact present a nonlinear relationship. 
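
The residual check in the first stage can be illustrated with the simplest Dickey-Fuller regression. This is a hedged sketch, not the test used in this paper: it computes only the t-ratio of $\rho$ in $\Delta e_t = \rho e_{t-1} + \text{error}$, with no intercept or lagged differences, and it omits the non-standard critical values that a residual-based cointegration test requires. The simulated series and the function name `df_statistic` are assumptions made for illustration.

```python
import math
import random


def df_statistic(e):
    """t-ratio of rho in the regression  delta e_t = rho * e_{t-1} + error.

    This is the basic Dickey-Fuller statistic without intercept or lags.
    Strongly negative values point towards stationary residuals.
    """
    lagged = e[:-1]
    delta = [b - a for a, b in zip(e[:-1], e[1:])]
    sxx = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, delta)) / sxx
    ssr = sum((d - rho * l) ** 2 for l, d in zip(lagged, delta))
    s2 = ssr / (len(delta) - 1)
    return rho / math.sqrt(s2 / sxx)


random.seed(0)
n = 500
noise = [random.gauss(0.0, 1.0) for _ in range(n)]

stationary = noise  # white-noise "residuals"

random_walk = []    # cumulated noise: a unit-root series
level = 0.0
for shock in noise:
    level += shock
    random_walk.append(level)

t_stationary = df_statistic(stationary)
t_random_walk = df_statistic(random_walk)
```

For white noise the statistic is of the order of minus the square root of the sample size, whereas for a random walk it stays close to zero, which is the contrast the residual-based test exploits.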

For the time series that presented nonlinear cointegration, an unrestricted local error correction model was also estimated to obtain both the long-run dynamic relationship and the speed of adjustment as a function of time. Below, for brevity, only one case is presented. The considered period went from 01/07/2000 to 31/12/2002. 


Fig. 1. Time series of stock prices 


Fig. 2. Speed of adjustment function 

Figure 1 shows the time series of the prices of the stocks considered, i.e., Ford Motor (F) and Motorola Inc. (MOT). The speed of adjustment is depicted in Figure 2 (points); its behaviour is very rough, so to smooth it we estimate the speed function using local polynomial regression (the LOESS procedure) [4] (line). 

It is worth mentioning that the speed of adjustment increases when strong market shocks perturb one of the time series, moving the system away from its steady state. As the adjustment mechanism drives the system towards the new equilibrium, the speed of adjustment tends to diminish. 


4 Conclusion 


The analysis of the time series of the 15 shares has shown that the relationships that bind two stocks in the long run do not always follow a linear error correction structure. To overcome this limitation, we have suggested a local error correction model that enables the investigation of the presence of nonlinear cointegration. By applying this local model, it has been shown that several of the pairs of stocks analysed are bound by a nonlinear cointegration relationship. Furthermore, the LECM, reformulated in terms of an unrestricted local error correction model, has also enabled the determination of the correction speed and the long-run relationship between variables as a function of time, allowing a dynamic cointegration relationship to be considered. 


Optimal dynamic asset allocation in a non-Gaussian world 


Gianni Pola 


Abstract. Asset Allocation deals with how to combine securities in order to maximize the investor’s gain. We consider the Optimal Asset Allocation problem in a multi-period investment setting: the optimal portfolio allocation is synthesised to maximise the joint probability of the portfolio fulfilling some target return requirements. The model does not assume any particular distribution on asset returns, thus providing an appropriate framework for a non-Gaussian environment. A numerical study illustrates that an optimal total-return fund manager is contrarian to the market. 


Key words: asset allocation, portfolio management, multi-period investment, optimal control, 
dynamic programming 


1 Introduction 


In the finance industry, portfolio allocations are usually achieved by an optimization process. Standard approaches for Optimal Asset Allocation are based on the Markowitz model [15]. According to this approach, return stochastic dynamics are mainly driven by the first two moments, and asymmetry and fat-tail effects are assumed to be negligible. The model does not behave very well when dealing with non-Gaussian asset classes, like Hedge Funds, Emerging markets and Commodities. Indeed it has been shown that sometimes minimizing the second-order moment leads to an increase in kurtosis and a decrease in skewness, thus increasing the probability of extreme negative events [3, 10, 22]. Many works have appeared recently in the literature that attempt to overcome these problems: these approaches are based on an optimization process with respect to a cost function that is sensitive to higher-order moments [2, 11, 12], or on a generalisation of the Sharpe [21] and Lintner [14] CAPM model [13, 16]. 

The second aspect of the Markowitz model is that it is static in nature. It permits the investor to make a one-shot allocation to a given time horizon: portfolio re-balancing during the investment lifetime is not addressed. Dynamic Asset Allocation models address the portfolio optimisation problem in multi-period settings [4, 7, 17, 20, 23]. 


M. Corazza et al. (eds.), Mathematical and Statistical Methods for Actuarial Sciences and Finance 
© Springer-Verlag Italia 2010 




In this paper we consider the Optimal Dynamic Asset Allocation (ODAA) problem 
from a Control System Theory perspective. We will show that the ODAA problem 
can be reformulated as a suitable optimal control problem. Given a sequence of 
target sets, which represent the portfolio specifications, an optimal portfolio allocation 
strategy is synthesized by maximizing the probability of fulfilling the target sets 
requirements. The proposed optimal control problem has been solved by using a 
Dynamic Programming [6] approach; in particular, by using recent results on the 
Stochastic Invariance Problem, established in [1, 18]. The proposed approach does 
not assume any particular distribution on the stochastic random variables involved and 
therefore provides an appropriate framework for non-Gaussian settings. Moreover the 
model does not assume stationarity in the stochastic returns dynamics. The optimal 
solution is given in a closed algorithmic form. 

We applied the formalism to a case study: a 2-year trade investing in the US market. The objective of the strategy is to beat a fixed target return at the end of the investment horizon. This study shows that an (optimal) total return fund manager should adopt a contrarian strategy: the optimal solution requires an increase in risky exposure in the presence of market drawdowns and a reduction in a bull market. Indeed the strategy is a concave dynamic strategy and therefore works well in oscillating markets. We contrast the ODAA model with a convex strategy: the Constant-Proportional-Portfolio-Insurance (CPPI) model. 

Preliminary results on the ODAA problem can be found in [19]. 

The paper is organised as follows. In Section 2 we give the formal statement of 
the model and show the optimal solution. Section 3 reports a case study. Section 4 
contains some final remarks. 


2 The model: formal statement and optimal solution 


Consider an investment universe made of $m$ asset classes. Given $k \in \mathbb{N}$, define the vector 

$$w_k = [\, w_k(1) \;\; w_k(2) \;\; \cdots \;\; w_k(m) \,]^T \in \mathbb{R}^m,$$

where the entries are the returns at time $k$. Let 

$$u_k = [\, u_k(1) \;\; u_k(2) \;\; \cdots \;\; u_k(m) \,]^T \in \mathbb{R}^m$$

be the portfolio allocation at time $k \in \mathbb{N}$. Usually some constraints are imposed on $u_k$ in the investment process: we assume that the portfolio $u_k$ is constrained to be in a given set $U_k \subset \mathbb{R}^m$. The portfolio time evolution is governed by the following stochastic dynamical control system: 


$$x_{k+1} = x_k (1 + u_k^T w_{k+1}), \quad k \in \mathbb{N}, \qquad (1)$$


where: 


• $x_k \in X = \mathbb{R}$ is the state, representing the portfolio value at time $k$; 
• $u_k \in U_k \subset \mathbb{R}^m$ is the control input, representing the portfolio allocation at time $k$; and 
• $w_k \in \mathbb{R}^m$ is a random vector describing the asset classes’ returns at time $k \in \mathbb{N}$. 


Equation (1) describes the time evolution of the portfolio value: $u_k^T w_{k+1}$ quantifies the percentage return of the portfolio allocation $u_k$ at time $k$ in the time interval $[k, k+1)$ due to the market performances $w_{k+1}$. 
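
The portfolio dynamics can be sketched directly from the recursion $x_{k+1} = x_k(1 + u_k^T w_{k+1})$. The allocations and returns below are made-up numbers for three asset classes over two periods, and the function names are illustrative.

```python
def portfolio_step(x, u, w_next):
    """One step of x_{k+1} = x_k * (1 + u_k^T w_{k+1})."""
    if len(u) != len(w_next):
        raise ValueError("allocation and return vectors must have equal length")
    portfolio_return = sum(ui * wi for ui, wi in zip(u, w_next))
    return x * (1.0 + portfolio_return)


def simulate(x0, allocations, returns):
    """Portfolio values x_0, ..., x_N for given allocations and realised returns."""
    path = [x0]
    for u, w_next in zip(allocations, returns):
        path.append(portfolio_step(path[-1], u, w_next))
    return path


# Made-up example: three asset classes, two periods.
allocations = [[0.0, 0.5, 0.5], [0.2, 0.5, 0.3]]
returns = [[0.01, 0.02, 0.04], [0.01, -0.01, -0.05]]
path = simulate(1.0, allocations, returns)
```

Starting from $x_0 = 1$, the first period returns 3% and the second loses 1.8%, so the path is 1, 1.03 and about 1.0115.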

Let $(\Omega, \mathcal{F}, P)$ be the probability space associated with the stochastic system in (1). The portfolio value $x_k$ at time $k = 0$ is assumed to be known and set to $x_0 = 1$. The mathematical model in (1) is characterised by no specific distribution on the asset classes’ returns. We model asset classes’ returns by means of Mixtures of Multivariate Gaussian Models (MMGMs), which provide accurate modelling of non-Gaussian distributions while being computationally simple to implement in practice.¹ We recall that a random vector $Y$ is said to be distributed according to an MMGM if its probability density function $p_Y$ can be expressed as the convex combination of probability density functions $p_{Y_i}$ of some multivariate Gaussian random variables $Y_i$, i.e., 


$$p_Y(y) = \sum_{i=1}^{N} \lambda_i\, p_{Y_i}(y), \qquad \lambda_i \in [0, 1], \qquad \sum_{i=1}^{N} \lambda_i = 1.$$

Some further constraints are usually imposed on the coefficients $\lambda_i$ so that the resulting random variable $Y$ is well behaved, by requiring, for example, semi-definiteness of the covariance matrix and unimodality in the marginal distribution. The interested reader can refer to [8] for a comprehensive exposition of the main properties of MMGMs. 
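
To see how a Gaussian mixture produces skewness and fat tails, the sketch below computes the first four moments of a univariate mixture in closed form. It is a one-dimensional illustration only; the two-state parameters are hypothetical and are not the calibration used in this paper.

```python
import math


def mixture_moments(weights, means, stdevs):
    """Mean, standard deviation, skewness and kurtosis of a univariate
    Gaussian mixture with the given component weights, means and stdevs."""
    if abs(sum(weights) - 1.0) > 1e-12 or any(l < 0 for l in weights):
        raise ValueError("weights must be non-negative and sum to one")
    mean = sum(l * m for l, m in zip(weights, means))
    m2 = m3 = m4 = 0.0
    for l, m, s in zip(weights, means, stdevs):
        d = m - mean  # shift of the component mean from the mixture mean
        m2 += l * (s**2 + d**2)
        m3 += l * (d**3 + 3 * d * s**2)
        m4 += l * (d**4 + 6 * d**2 * s**2 + 3 * s**4)
    return mean, math.sqrt(m2), m3 / m2**1.5, m4 / m2**2


# Hypothetical two-state mixture: a calm regime and a rarer stressed regime.
mean, vol, skew, kurt = mixture_moments(
    weights=[0.8, 0.2], means=[0.02, -0.03], stdevs=[0.03, 0.08]
)

# A single Gaussian component is the benchmark: zero skewness, kurtosis 3.
_, _, skew_gauss, kurt_gauss = mixture_moments([1.0], [0.0], [2.0])
```

A stressed state with a lower mean and a higher volatility is enough to generate negative skewness and kurtosis above 3, the two features reported for bond and equity returns in the case study.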

The class of control inputs that we consider in this work is that of Markov policies [6]. Given a finite time horizon $N \in \mathbb{N}$, a Markov policy is defined by the sequence 

$$\pi = \{u_0, u_1, \ldots, u_{N-1}\}$$

of measurable maps $u_k : X \to U_k$. Denote by $\mathcal{U}_k$ the set of measurable maps $u_k : X \to U_k$ and by $\Pi_N$ the collection of Markov policies. For further purposes let $\pi^k = \{u_k, u_{k+1}, \ldots, u_{N-1}\}$. 

Let us consider a finite time horizon $N$ which represents the lifetime of the considered investment. Our approach to portfolio construction deals with how to select a Markov policy $\pi$ in order to fulfil some specifications on the portfolio value $x_k$ at times $k = 1, \ldots, N$. The specifications are defined by means of a sequence of target sets $\{\Sigma_1, \Sigma_2, \ldots, \Sigma_N\}$ with $\Sigma_k \subset X$. The investor wishes to have a portfolio value $x_k$ at time $k$ that is in $\Sigma_k$. Typical target sets $\Sigma_k$ are of the form $\Sigma_k = [\underline{x}_k, +\infty[$ and aim to achieve a performance that is bounded below by $\underline{x}_k \in \mathbb{R}$. This formulation of the specifications allows the investor to control the portfolio evolution during its lifetime, since the target sets $\Sigma_k$ depend on time $k$. 

The portfolio construction problem is then formalized as follows: 


¹ We stress that MMGM modelling is only one of the possible choices: the formal results below hold without any assumptions on the return stochastic dynamics. 

Problem 1. (Optimal Dynamic Asset Allocation (ODAA)) Given a finite time horizon $N \in \mathbb{N}$ and a sequence of target sets 


$$\{\Sigma_1, \Sigma_2, \ldots, \Sigma_N\}, \qquad (2)$$


where the $\Sigma_k$ are Borel subsets of $X$, find the optimal Markov policy $\pi$ that maximizes the joint probability quantity 


$$P(\{\omega \in \Omega : x_0 \in \Sigma_0, x_1 \in \Sigma_1, \ldots, x_N \in \Sigma_N\}). \qquad (3)$$


The ODAA problem can be solved by using a dynamic programming approach [6] and 
in particular by resorting to recent results on stochastic reachability (see e.g., [18]). 

Since the solution of Problem 1 can be obtained by a direct application of the results in [1, 18], in the following we only report the basic facts which lead to the synthesis of the optimal portfolio allocation. Given $x \in X$ and $u \in \mathbb{R}^m$, denote by $p_{f(x,u,w_{k+1})}$ the probability density function of the random variable 


$$f(x, u, w_{k+1}) = x (1 + u^T w_{k+1}), \qquad (4)$$


associated with the dynamics of the system in (1). Given the sequence of target sets in (2) and a Markov policy $\pi$, we introduce the following cost function $V$, which associates a real number $V(k, x, \pi^k) \in [0, 1]$ to a triple $(k, x, \pi^k)$ by: 


$$V(k, x, \pi^k) = \begin{cases} I_{\Sigma_N}(x), & \text{if } k = N, \\ \int_{\Sigma_{k+1}} V(k+1, z, \pi^{k+1})\, p_f(z)\, dz, & \text{if } k = N-1, N-2, \ldots, 0, \end{cases} \qquad (5)$$


where $I_{\Sigma_N}(x)$ is the indicator function of the Borel set $\Sigma_N$ (i.e. $I_{\Sigma_N}(x) = 1$ if $x \in \Sigma_N$ and $I_{\Sigma_N}(x) = 0$ otherwise) and $p_f$ stands for $p_{f(x, u_k, w_{k+1})}$. Results in [18] show that the cost function $V$ is related to the probability quantity in (3) as follows: 


$$P(\{\omega \in \Omega : x_0 \in \Sigma_0, x_1 \in \Sigma_1, \ldots, x_N \in \Sigma_N\}) = V(0, x_0, \pi).$$


Hence the ODAA problem can be reformulated, as follows: 


Problem 2. (Optimal Dynamic Asset Allocation) Given a finite time horizon $N \in \mathbb{N}$ and the sequence of target sets in (2), compute: 


$$\pi^* = \arg \sup_{\pi \in \Pi_N} V(0, x_0, \pi).$$


The above formulation of the ODAA problem is an intermediate step towards the 
solution of the optimal control problem under study which can now be reported 
hereafter. 

Theorem 1. The optimal value of the ODAA Problem is equal to [18] 

$$p^* = J_0(x_0),$$

where $J_0(x)$ is given by the last step of the following algorithm: 


$$J_N(x) = I_{\Sigma_N}(x),$$

$$J_k(x) = \sup_{u_k \in U_k} \int_{\Sigma_{k+1}} J_{k+1}(z)\, p_{f(x, u_k, w_{k+1})}(z)\, dz, \qquad k = N-1, N-2, \ldots, 1, 0. \qquad (6)$$


The algorithm proceeds as follows. Suppose that the time horizon of our investment is $N$. First, the optimisation problem (6) is solved for $k = N - 1$. This problem can be solved numerically with the optimisation routines available in many software packages, for example MATLAB, Mathematica and Maple. The solution to (6) provides the optimal strategy $u_k^*(x)$ to be applied to the investment when the value of the portfolio is $x$ at time $k$. Once the optimization problem (6) is solved for $k = N - 1$, the function $J_{N-1}(x)$ is also known. Hence, on the basis of the knowledge of $J_{N-1}(x)$, one can proceed one step backwards and solve the optimisation problem (6) at step $k = N - 2$. This algorithmic optimisation terminates when $k = 0$. The outcome of this algorithm is precisely the optimal control strategy that solves (3), as formally stated in Theorem 1. 
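
The backward recursion can be illustrated on a toy problem in which the integral in (6) becomes a finite sum. Everything below is an assumption made for illustration and not the setting of the case study: a single risky asset returning +10% or -10% with equal probability, a cash asset with zero return, two admissible allocations, a horizon of two periods and a final target of 1.05.

```python
from functools import lru_cache

# Illustrative assumptions (not from the paper): one risky asset whose return is
# +10% or -10% with equal probability, a cash asset with zero return, and only
# two admissible allocations, fully in cash (u = 0) or fully risky (u = 1).
SHOCKS = [(0.10, 0.5), (-0.10, 0.5)]  # (risky return, probability)
CONTROLS = [0.0, 1.0]
N = 2
LOWER = {1: 0.0, 2: 1.05}  # target sets [LOWER[k], +inf) for k = 1, ..., N


def in_target(k, x):
    return x >= LOWER[k]


@lru_cache(maxsize=None)
def J(k, x):
    """Value function: best probability of staying in all later target sets."""
    if k == N:
        return 1.0 if in_target(N, x) else 0.0
    return max(expected_value(k, x, u) for u in CONTROLS)


def expected_value(k, x, u):
    """Discrete analogue of the integral over the target set at time k + 1."""
    total = 0.0
    for ret, prob in SHOCKS:
        z = x * (1.0 + u * ret)
        if in_target(k + 1, z):
            total += prob * J(k + 1, z)
    return total


def best_control(k, x):
    return max(CONTROLS, key=lambda u: expected_value(k, x, u))


p_star = J(0, 1.0)
u_after_gain = best_control(1, 1.0 * 1.10)
u_after_flat = best_control(1, 1.0)
```

In this toy setting the maximal probability is 0.5. After a first-period gain the optimal allocation moves to cash, while with an unchanged portfolio value it stays fully risky, a small-scale version of a contrarian rebalancing rule.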


3 Case study: a total return portfolio in the US market 


In this section we apply the proposed methodology to the synthesis of a total return 
product. The investment’s universe consists of 3 asset classes: the money market, the 
US bond market and the US equity market. Details on the indices used in the analysis 
are reported below: 


Label  Asset         Index
C      Money market  US Generic T-bills 3 months
B      US bond       JP Morgan US Government Bond All Maturity
E      US equity     S&P500


Time series are in local currency and weekly based, from January 1st 1988 to December 28th 2007. The total return product consists of a 2-year trade. The investor’s objective is to beat a target return of 7% (annualised value) at maturity; his risk budget corresponds to a 7% (ex ante) monthly Value at Risk at the 99% confidence level (VaR99m).² 

The portfolio allocation will be synthesized applying the results presented in the 
previous section. We first consider an ODAA problem with a quarter rebalancing 
(N = 8). 


² This risk budget corresponds to an ex ante (annual) volatility of 10.42%. 
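
The volatility figure in the footnote can be reproduced with a back-of-the-envelope conversion. The sketch assumes zero-mean Gaussian monthly returns and square-root-of-time scaling; the source does not state which convention was used, so this is only one reading that happens to match the quoted number.

```python
import math
from statistics import NormalDist

var_99_monthly = 0.07                     # 7% monthly Value at Risk at 99%
z_99 = NormalDist().inv_cdf(0.99)          # about 2.326
vol_monthly = var_99_monthly / z_99        # zero-mean Gaussian approximation
vol_annual = vol_monthly * math.sqrt(12)   # square-root-of-time scaling
vol_annual_pct = round(100 * vol_annual, 2)
```

Under these assumptions the annualised volatility rounds to 10.42%.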


Table 1. Probabilistic model assumptions 


               C       B        E
return (ann)   3.24%   5.46%    10.62%
vol (ann)      0%      4.45%    14.77%
skewness       0       -0.46    -0.34
kurtosis       3       4.25     5.51
corr to C      1       0        0
corr to B      0       1        0.0342
corr to E      0       0.0342   1


The first step consists in building up a probabilistic model that describes the asset classes’ return dynamics. Risk figures and expected returns³ are reported in Table 1. Asset classes present significant deviations from the Gaussian nature (Jarque-Bera test; 99% confidence level): bond and equity markets are leptokurtic and negatively skewed. We assume stationarity in the dynamics of the returns distribution. This market scenario has been modelled with a 2-state Mixture of Multivariate Gaussian Models (MMGM), as detailed in the Appendix. The proposed MMGM modelling exactly fits the asset classes’ performance and risk figures up to the fourth order, and the correlation pattern up to the second order. 

The investment requirements are translated into the model as follows. The optimisation criterion consists in maximising the probability $P(x_8 > 1.07^2)$, $x_8$ being the portfolio value at the end of the second year. The formalisation of the target sets $\Sigma_k$ is given below: 


$$\Sigma_0 = \{1\}, \qquad \Sigma_k = [0, +\infty), \ \forall k = 1, 2, \ldots, 7, \qquad \Sigma_8 = [1.07^2, +\infty). \qquad (7)$$


More precisely, the optimisation problem consists in determining the (optimal) dynamic allocation grids $u_k$ ($k = 0, 1, \ldots, 7$) in order to maximise the joint probability $P(x_1 \in \Sigma_1, \ldots, x_8 \in \Sigma_8)$ subject to the Value-at-Risk budget constraint. By applying Theorem 1 we obtain the optimal control strategy that is illustrated in Figure 1. (Budget and long-only constraints have been included in the optimisation process.) The allocation at the beginning of the investment (see Figure 1, upper-left panel) is 46% Bond and 54% Equity market. After the first quarter, the fund manager revises the portfolio allocation (see Figure 1, upper-right panel). Abscissas report the portfolio value $x_1$ at time $k = 1$. For each portfolio realisation $x_1$, the map gives the corresponding portfolio allocation. As the portfolio strategy delivers higher and higher performance in the first quarter, the optimal rebalancing requires a reduction in the risky exposure. If $x_1$ reaches a value around 1.0832, a 100% cash allocation guarantees the target objective will be reached at maturity. Conversely, a portfolio 


³ In the present work we do not address the problem of forecasting returns and risk figures. Volatility, skewness, kurtosis and the correlation pattern have been estimated by taking the historical average. Expected returns have been derived by assuming a constant Sharpe ratio (0.50), and a cash level given by the US Generic T-bills 3 months on December 31st 2007. 

Fig. 1. ODAA optimal solution 


value of $x_1 = 0.9982$ moves the optimal strategy to the maximum allowed risky exposure (i.e. 7% VaR99m). 
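
The order of magnitude of the all-cash threshold can be checked by discounting the final target at the money-market rate. This is a rough sketch: it assumes that cash compounds at the annualised rate of Table 1 over the seven remaining quarters, which may differ from the compounding convention and grid resolution behind the value read off the allocation map.

```python
target = 1.07 ** 2          # 7% annualised over two years
cash_rate = 0.0324          # annualised money-market return from Table 1
years_left = 1.75           # seven quarters remain after the first rebalancing

# Assumption: cash compounds at the annual rate over the remaining period.
cash_growth = (1.0 + cash_rate) ** years_left
threshold = target / cash_growth
```

Under these assumptions the threshold comes out close to 1.083, of the same order as the value of about 1.0832 quoted in the text.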

Maps for $k = 2, 3, 4, 5, 6, 7$ exhibit similar characteristics to those for $k = 1$. The main difference is that as $k$ increases the portfolio rebalancing gets sharper and sharper. 

The ODAA maximal probability $p^*$ is 68.40%. In order to make a comparison with more standard approaches, we ran the same exercise for a Markowitz constant-mix investor: in this case the optimal solution requires the full risk budget to be invested, with a maximal probability of 61.90%. The ODAA model becomes more and more efficient as the rebalancing frequency increases. Table 2 reports the maximal probabilities for rebalancing frequencies of three months ($N = 8$), one month ($N = 24$), two weeks ($N = 52$) and one week ($N = 104$). This result is a direct consequence of the Dynamic Programming approach pursued in this paper. It is worth emphasising that the Markowitz constant-mix approach does not produce similar results: in that case the probability is rather insensitive to the rebalancing frequency. 

The allocation grids reported in Figure 1 clearly show that an (optimal) total return fund manager should adopt a contrarian rebalancing policy [9]: the investor should increase the risky exposure in the presence of market drawdowns and reduce it in case of positive performance. The contrarian attitude of the model is a peculiarity of 

Table 2. Maximal probabilities p* 


              3 m      1 m      2 w      1 w
Probability   68.40%   72.50%   76.28%   77.76%


concave strategies.⁴ This feature makes the model particularly appealing in oscillating markets. 

We conclude this section by making a comparison with a convex strategy. We ran a Constant-Proportional-Portfolio-Insurance (CPPI) model [5] with a 2-year horizon: the model has been designed to protect the capital at maturity. We assume that the risky basket is composed of the equity market (the multiplier has been set to 6) and that trading is weekly based. In order to make a comparison, we assume the same Value-at-Risk risk budget as in the previous case. 
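
The mechanics of a CPPI rebalancing step can be sketched as follows. The multiplier of 6 is taken from the text, while the constant floor of 0.9, the zero cash return, the exposure cap and the two-step return path are made-up values for illustration; the Value-at-Risk budget constraint of the case study is not modelled.

```python
def cppi_step(value, floor, multiplier, risky_return, safe_return, max_exposure=1.0):
    """One CPPI rebalancing step.

    The risky holding is multiplier * cushion, capped at max_exposure * value,
    where the cushion is the excess of the portfolio value over the floor.
    Returns (new_value, risky_weight_used).
    """
    cushion = max(value - floor, 0.0)
    risky = min(multiplier * cushion, max_exposure * value)
    safe = value - risky
    new_value = risky * (1.0 + risky_return) + safe * (1.0 + safe_return)
    return new_value, risky / value


# Made-up path: a 10% equity loss followed by a 10% gain, zero cash return.
value, floor, multiplier = 1.0, 0.9, 6
weights = []
for equity_return in (-0.10, 0.10):
    value, weight = cppi_step(value, floor, multiplier, equity_return, 0.0)
    weights.append(weight)
```

The risky weight falls after the loss, from 60% to roughly 26%, which is the pro-cyclical behaviour of a convex strategy and the opposite of the contrarian ODAA rebalancing.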

Table 3 offers a comparison between the two approaches. (The ODAA results refer to a weekly based strategy; $N = 104$.) The CPPI strategy works well in protecting the capital (99.60%), but it presents a lower probability of achieving large returns. (The probability of beating a 7% target return at maturity is 33.64%.) The distribution of performance at maturity is positively skewed and platykurtic, thus revealing a very stable strategy. Conversely, the ODAA strategy presents a higher probability of delivering its target return (77.76%), but a lower probability of protecting the capital. The ODAA performance distribution is negatively skewed and leptokurtic. Higher-order risk is paid off by the large probability of achieving more ambitious returns. 

The applications presented in this section should be considered for illustrating the 
methodology. The views expressed in this work are those of the author and do not 
necessarily correspond to those of Crédit Agricole Asset Management. 


Table 3. Comparison between the ODAA and CPPI strategies


                        ODAA      CPPI
Mean perf N (ann)       5.68%     6.15%
Median perf N (ann)     7.03%     3.64%
Skewness perf N         −2.80     1.35
Kurtosis perf N         11.08     4.44
Vol (ann)               2.70%     3.05%
Sharpe (ann)            1.51      0.44
Prob 0%                 91.40%    99.60%
Prob cash               85.58%    53.21%
Prob 7%                 77.76%    33.64%


⁴ The exposure diagram reports on the X-axis the portfolio value and on the Y-axis the risky
exposure. Concave (resp. convex) strategies are characterised by a concave (resp. convex)
exposure diagram.


4 Conclusions


In this paper we considered the Optimal Dynamic Asset Allocation problem. Given 
a sequence of target sets that the investor would like his portfolio to stay within, the 
optimal strategy is synthesised in order to maximise the joint probability of fulfilling 
the investment requirements. The approach does not assume any specific distributions 
for the asset classes’ stochastic dynamics, thus being particularly appealing to treat 
non-Gaussian asset classes. The proposed optimal control problem has been solved by 
leveraging results on stochastic invariance. The optimal solution exhibits a contrarian 
attitude, thus performing very well in oscillating markets. 


Acknowledgement. The author would like to thank Giordano Pola (University of L’Aquila,
Center of Excellence DEWS, Italy), Roberto Dopudi and Sylvie de Laguiche (Crédit Agricole
Asset Management) for stimulating discussions on the topic of this paper.


Appendix: Markets MMGM modeling 


Asset classes used in the case study present significant deviations from Gaussianity. This
market scenario has been modelled by a 2-state MMGM. States 1 and 2 are
characterised by the following univariate statistics:⁵

$$ \{\mu_1(i)\}_i = [0.000611;\ 0.001373;\ 0.002340], $$

$$ \{\sigma_1(i)\}_i = [0.000069;\ 0.005666;\ 0.019121], $$

$$ \{\mu_2(i)\}_i = [0.000683;\ -0.016109;\ -0.017507], $$

$$ \{\sigma_2(i)\}_i = [0.000062;\ 0.006168;\ 0.052513], $$


and correlation matrix:⁶


               C          B          E
corr to C      1          0.0633     0.0207
corr to B      0.0633     1          −0.0236
corr to E      0.0207     −0.0236    1


Transition probabilities are uniform and the unconditional probability of State 1 is 
98%. The above MMGM model correctly represents the univariate statistics of the 
asset classes up to the fourth order (as detailed in Table 1) and up to the second order 
concerning the correlation patterns. 
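
As a rough check of how a 2-state Gaussian mixture generates skewness and excess kurtosis, the sketch below computes the first four moments of each asset's weekly return analytically. It assumes that a one-period return is simply drawn from State 1 with probability 0.98 and from State 2 otherwise; the function name `mixture_moments` is hypothetical and the asset order is the one used in the statistics above.

```python
import numpy as np

# Weekly state statistics from the appendix (rows: State 1, State 2).
mu = np.array([[0.000611, 0.001373, 0.002340],
               [0.000683, -0.016109, -0.017507]])
sigma = np.array([[0.000069, 0.005666, 0.019121],
                  [0.000062, 0.006168, 0.052513]])
weights = np.array([0.98, 0.02])  # unconditional state probabilities

def mixture_moments(mu, sigma, weights):
    """Mean, volatility, skewness and kurtosis of a Gaussian mixture, per asset."""
    w = weights[:, None]
    mean = (w * mu).sum(axis=0)
    d = mu - mean                                    # state mean minus mixture mean
    m2 = (w * (sigma**2 + d**2)).sum(axis=0)
    m3 = (w * (d**3 + 3.0 * d * sigma**2)).sum(axis=0)
    m4 = (w * (d**4 + 6.0 * d**2 * sigma**2 + 3.0 * sigma**4)).sum(axis=0)
    return mean, np.sqrt(m2), m3 / m2**1.5, m4 / m2**2

mean, vol, skew, kurt = mixture_moments(mu, sigma, weights)
for i in range(3):
    print(f"asset {i + 1}: mean={mean[i]:.6f} vol={vol[i]:.6f} "
          f"skew={skew[i]:.2f} kurt={kurt[i]:.2f}")
```

A rare state with a lower mean and a higher volatility is enough to produce negative skewness and a kurtosis above 3, which is the mechanism the MMGM exploits.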


⁵ $\mu_s(i)$ and $\sigma_s(i)$ indicate (resp.) the performance and volatility of asset i in state s. (Values
are weekly based.)
⁶ We assume the same correlation matrix for the above Gaussian models.


Fair costs of guaranteed minimum death benefit
contracts


François Quittard-Pinon and Rivo Randrianarivony


Abstract. The authors offer a new perspective on the domain of guaranteed minimum death 
benefit contracts. These products have the particular feature of offering investors a guaranteed 
capital upon death. A complete methodology based on the generalised Fourier transform is 
proposed to investigate the impacts of jumps and stochastic interest rates. This paper thus 
extends Milevsky and Posner (2001). 


Key words: life insurance contracts, variable annuities, guaranteed minimum death benefit, 
stochastic interest rates, jump diffusion models, mortality models 


1 Introduction 


The contract analysed in this article is a Guaranteed Minimum Death Benefit contract 
(GMDB), which is a life insurance contract pertaining to the class of variable annuities 
(VAs). For an introduction to this subject, see Hardy [4] and Bauer, Kling and Russ [2]. 
The guarantee provided, in effect only upon death, is paid for by continuously deducting
small amounts from the policyholder’s subaccount. It is shown in this chapter how
these fees can be endogenously determined. Milevsky and Posner [8] found these fees 
overpriced by insurance companies with respect to their model fair price. To answer 
this overpricing puzzle, the effects of jumps in financial prices, stochastic interest 
rates and mortality are considered. For this purpose, a new model is proposed which 
generalises Milevsky and Posner [8]. 


2 General framework and main notations 


2.1 Financial risk and mortality 


Financial risk is related to market risk firstly because the policyholder’s account is
linked to a financial asset or an index, and secondly via interest rates. We denote by r
the stochastic process modelling the instantaneous risk-free rate. The discount factor
is thus given by:

$$ \delta_t = e^{-\int_0^t r_s\,ds}. \tag{1} $$


The policyholder’s account value is modelled by the stochastic process S. In that
model, $\ell$ stands for the fees associated with the Mortality and Expense (M&E) risk
charge.


The future lifetime of a policyholder aged x is the r.v. $T_x$. For an individual aged
x, the probability of death before time t > 0 is

$$ P(T_x \le t) = 1 - \exp\left(-\int_0^t \lambda(x+s)\,ds\right), \tag{2} $$

where $\lambda$ denotes the force of mortality. As usual, $F_x$ and $f_x$ denote respectively the
c.d.f. and the p.d.f. of the r.v. $T_x$. To ease notation, we generally omit the x from
the future lifetime and write T when no confusion is possible. We assume stochastic
independence between mortality and financial risks.


2.2 Contract payoff 


The insurer promises to pay upon the policyholder’s death the contractual amount
$\max\{S_0 e^{gT}, S_T\}$, where g is a guaranteed rate, $S_0$ is the insured’s initial investment
and $S_T$ is the subaccount value at the time of death x + T. We can generalise this payoff
further by considering a contractual expiry date $x + \Theta$. The contract only provides a
guarantee on death. If the insured is still alive after time $\Theta$ passes, she will
receive the account value at that time. For the sake of simplicity, we keep the first
formulation, and we note that:

$$ \max\{S_0 e^{gT}, S_T\} = S_T + [S_0 e^{gT} - S_T]^+. \tag{3} $$

Written in this way, the contract appears as a long position on the policyholder account
plus a long position on a put option written on the insured account. Two remarks are in
order: firstly, the policyholder has the same amount as if she had invested in the financial
market (fees aside), but is insured to receive more, thanks to the put option.
Secondly, because T is a r.v., her option is not a vanilla one but an option whose
exercise date is itself random (the policyholder’s death).
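
The decomposition (3) can be checked numerically with a few lines. The function name `gmdb_payoff` and the sample values are illustrative only.

```python
import numpy as np

def gmdb_payoff(s0, s_T, g, T):
    """Death benefit max{S0 e^{gT}, S_T}, written as account value plus put payoff."""
    guarantee = s0 * np.exp(g * T)
    put_payoff = np.maximum(guarantee - s_T, 0.0)
    return s_T + put_payoff

# Account values at death below, at and above the guaranteed amount.
s_T = np.array([60.0, 100.0 * np.exp(0.05 * 10.0), 250.0])
print(gmdb_payoff(100.0, s_T, g=0.05, T=10.0))
```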

The other difference with the option analogy lies in the fact that in this case there is
no upfront payment. In this contract, the investor pays for the guarantee by instalments.
The paid fees constitute the so-called M&E risk charges. We assume they are continuously
deducted from the policyholder account at the contractual proportional rate
$\ell$. More precisely, we consider that in the time interval (t, t + dt), the life insurance
company receives $\ell S_t\,dt$ as instantaneous earnings. We denote by F the cumulative
discounted fees: $F_t$ is the discounted accumulated fees up to time t, which can be a
stopping time for the subaccount price process S. The contract can also be designed
in order to cap the guaranteed rate g; in the VA literature, this is known as capping
the rising floor.


2.3 Main Equations

Under a chosen risk-neutral measure Q, the GMDB option fair price is thus

$$ G(\ell) = E_Q\left[\delta_T (S_0 e^{gT} - S_T)^+\right], $$

and upon conditioning on the insured future lifetime,

$$ G(\ell) = E_Q\left[ E_Q\left[\delta_T (S_0 e^{gT} - S_T)^+ \mid T = t\right] \right], \tag{4} $$

which, taking into account a contractual expiry date, gives:

$$ G(\ell) = \int_0^{\Theta} f_x(t)\, E_Q\left[\delta_T (S_0 e^{gT} - S_T)^+ \mid T = t\right] dt. \tag{5} $$

If $F_T$ denotes the discounted value of all fees collected up to time T, the fair value
of the M&E charges can be written

$$ ME(\ell) = E_Q[F_T], $$

which after conditioning also gives:

$$ ME(\ell) = E_Q\left[E_Q[F_T \mid T = t]\right]. \tag{6} $$

Because the protection is only triggered by the policyholder’s death, the endogenous
equilibrium price of the fees is the solution in $\ell$, if any, of the following equation

$$ G(\ell) = ME(\ell). \tag{7} $$


This is the key equation of this article. To solve it we have to define the investor 
account dynamics, make assumptions on the process S, and, of course, on mortality. 


3 Pricing model 


The zero-coupon bond is assumed to obey the following stochastic differential
equation (SDE) in the risk-neutral universe:

$$ \frac{dP(t,T)}{P(t,T)} = r_t\,dt + \sigma_P(t,T)\,dW_t, \tag{8} $$

where P(t, T) is the price at time t of a zero-coupon bond maturing at time T, $r_t$ is
the instantaneous risk-free rate, $\sigma_P(t,T)$ describes the volatility structure and W is a
standard Brownian motion.

In order to take into account a dependency between the subaccount and the interest
rates, we suggest the introduction of a correlation between the diffusive part of the
subaccount process and the zero-coupon bond dynamics. The underlying account
price process S is supposed to behave according to the following SDE under the
chosen equivalent pricing measure Q:

$$ \frac{dS_t}{S_{t^-}} = (r_t - \ell)\,dt + \rho\sigma\,dW_t + \sigma\sqrt{1-\rho^2}\,dZ_t + (Y - 1)\,d\tilde{N}_t. \tag{9} $$

Again, $r_t$ is the instantaneous interest rate, $\ell$ represents the fixed proportional
insurance risk charge, $\sigma$ is the asset’s volatility, $\rho$ is the correlation between the asset and
the interest rate, W and Z are two independent standard Brownian motions, and the
last part takes into account the jumps. $\tilde{N}$ is a compensated Poisson process with
intensity $\lambda$, while Y, a r.v. independent from the former processes, represents the price
change after a jump. The jump size is defined by J = ln(Y).

Let us emphasise here that the non-drift part M, defined by
$dM_t = \rho\sigma\,dW_t + \sigma\sqrt{1-\rho^2}\,dZ_t + (Y - 1)\,d\tilde{N}_t$, is a martingale in the
considered risk-neutral universe.


3.1 Modelling stochastic interest rates and subaccount jumps 


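Denoting by $N$ the Poisson process with intensity $\lambda$ and applying Itô’s lemma, the
dynamics of S writes as:

$$ S_t = S_0 \exp\left( \int_0^t r_s\,ds - \left(\ell + \tfrac{1}{2}\sigma^2 + \lambda\kappa\right)t + \rho\sigma W_t + \sigma\sqrt{1-\rho^2}\,Z_t + \sum_{i=1}^{N_t} \ln((Y)_i) \right), \tag{10} $$

where $\kappa = E(Y - 1)$. The zero-coupon bond price obeys the following equation:

$$ P(t,T) = P(0,T) \exp\left( \int_0^t \sigma_P(s,T)\,dW_s - \tfrac{1}{2}\int_0^t \sigma_P^2(s,T)\,ds + \int_0^t r_s\,ds \right). $$

The subaccount dynamics can be written as:

$$ S_t = \frac{S_0}{P(0,t)} \exp\left( -\left(\ell + \tfrac{1}{2}\sigma^2 + \lambda\kappa\right)t + \tfrac{1}{2}\int_0^t \sigma_P^2(s,t)\,ds + \int_0^t [\rho\sigma - \sigma_P(s,t)]\,dW_s + \sigma\sqrt{1-\rho^2}\,Z_t + \sum_{i=1}^{N_t} \ln((Y)_i) \right). $$

Let us introduce the T-forward measure $Q_T$ defined by

$$ \left.\frac{dQ_T}{dQ}\right|_{\mathcal{F}_t} = \frac{\delta_t\,P(t,T)}{P(0,T)} = \exp\left( \int_0^t \sigma_P(s,T)\,dW_s - \tfrac{1}{2}\int_0^t \sigma_P^2(s,T)\,ds \right), \tag{11} $$

where $\delta_t$ is the discount factor defined in (1). Girsanov’s theorem states that the
stochastic process $W^T$ defined by $W_t^T = W_t - \int_0^t \sigma_P(s,T)\,ds$ is a standard Brownian
motion under $Q_T$. Hence, the subaccount price process can be derived under the
T-forward measure:

$$ S_t = \frac{S_0}{P(0,t)}\,e^{X_t}, \tag{12} $$

where X is the process defined by

$$ \begin{aligned} X_t = {} & -\left(\ell + \tfrac{1}{2}\sigma^2 + \lambda\kappa\right)t + \int_0^t \left( \sigma_P(s,T)\,(\rho\sigma - \sigma_P(s,t)) + \tfrac{1}{2}\sigma_P^2(s,t) \right) ds \\ & + \int_0^t (\rho\sigma - \sigma_P(s,t))\,dW_s^T + \sigma\sqrt{1-\rho^2}\,Z_t + \sum_{i=1}^{N_t} \ln((Y)_i). \end{aligned} \tag{13} $$

A lengthy calculation shows that the characteristic exponent $\phi_T(u)$ of $X_T$ under
the T-forward measure, defined by $E_{Q_T}\left[e^{iuX_T}\right] = e^{\phi_T(u)}$, writes:

$$ \phi_T(u) = -iu\ell T - \frac{iu}{2}\Sigma_T^2 - \frac{u^2}{2}\Sigma_T^2 + \lambda T\left( \phi_J(u) - 1 - iu\,(\phi_J(-i) - 1) \right), \tag{14} $$

where $\phi_J(u)$ denotes the characteristic function of the i.i.d. r.v.’s $J_i = \ln((Y)_i)$ and

$$ \Sigma_T^2 = \int_0^T \left( \sigma^2 - 2\rho\sigma\,\sigma_P(s,T) + \sigma_P^2(s,T) \right) ds. \tag{15} $$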


3.2 Present value of fees 


Using the definition of $F_t$ and (6), it can be shown that:

$$ ME(\ell) = 1 - \int_0^{\infty} e^{-\ell t} f_x(t)\,dt, \tag{16} $$

where $f_x$ is the p.d.f. of the r.v. T. A very interesting fact is that only the mortality
model plays a role in the computation of the present value of fees as seen in (16).
Taking into account the time to contract expiry date $\Theta$, we have:

$$ ME(\ell) = 1 - \int_0^{\Theta} e^{-\ell t} f_x(t)\,dt - (1 - F_x(\Theta))\,e^{-\ell\Theta}. \tag{17} $$
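
The present value of fees with an expiry date can be evaluated by direct quadrature. The sketch below is an illustration, not the authors' implementation: it assumes a Gompertz force of mortality $\lambda(x) = \frac{1}{b}\exp((x-m)/b)$, normalises the initial investment to 1, and borrows its inputs (female aged 50, m = 88.8725, b = 9.136, expiry at age 75, fee of 10.85 bp) from the tables of the empirical study. The function names are hypothetical.

```python
import numpy as np
from scipy.integrate import quad

def gompertz_survival(t, x, m, b):
    """P(T_x > t) for the force of mortality (1/b) exp((x - m)/b)."""
    return np.exp(np.exp((x - m) / b) * (1.0 - np.exp(t / b)))

def gompertz_pdf(t, x, m, b):
    """Density f_x(t) = lambda(x + t) * P(T_x > t)."""
    hazard = np.exp((x + t - m) / b) / b
    return hazard * gompertz_survival(t, x, m, b)

def me_charges(fee, x, m, b, expiry_age):
    """Present value of fees per unit of initial investment, with an expiry date."""
    theta = expiry_age - x
    integral, _ = quad(lambda t: np.exp(-fee * t) * gompertz_pdf(t, x, m, b),
                       0.0, theta)
    return 1.0 - integral - gompertz_survival(theta, x, m, b) * np.exp(-fee * theta)

me = me_charges(0.001085, x=50, m=88.8725, b=9.136, expiry_age=75)
print(f"ME / S0 = {100.0 * me:.2f} %")
```

With these inputs the ratio comes out at roughly 2.5%, in line with the order of magnitude reported for the no-jump case, because the fee side depends only on the mortality model and not on the asset dynamics.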


3.3 Mortality models 


Two mortality models are taken into account, namely the Gompertz model and the
Makeham model. Another approach could be to use the Lee-Carter model, or introduce
a mortality hazard rate as in Ballotta and Haberman (2006). In the case of the Gompertz
mortality model, the force of mortality at age x follows

$$ \lambda(x) = B\,C^x, \tag{18} $$

where B > 0 and C > 1. It can also be written as $\lambda(x) = \frac{1}{b}\exp\left(\frac{x-m}{b}\right)$, where m > 0
is the modal value of the Gompertz distribution and b > 0 is a dispersion parameter.
Starting from (2), it can be shown that the present value of fees¹ in the case of a
Gompertz-type mortality model amounts to:

$$ ME(\ell) = 1 - e^{b\lambda(x)}\,(b\lambda(x))^{\ell b}\left[ \Gamma(1 - \ell b,\ b\lambda(x)) - \Gamma\left(1 - \ell b,\ b\lambda(x)e^{\Theta/b}\right) \right] - e^{b\lambda(x)\left(1 - e^{\Theta/b}\right)}\,e^{-\ell\Theta}, \tag{19} $$

where $\Gamma(a, x) = \int_x^{\infty} e^{-t}\,t^{a-1}\,dt$ is the upper incomplete gamma function, in which a must
be positive. This condition entails an upper limit on the possible value of the insurance
risk charge $\ell$:

$$ \ell < \frac{1}{b}. \tag{20} $$


¹ It is to be noted that formula (19) corrects typos in Milevsky and Posner’s (2001) original
article.


The Makeham mortality model adds an age-independent component to the Gompertz
force of mortality (18) as follows:

$$ \lambda(x) = A + B\,C^x, \tag{21} $$

where B > 0, C > 1 and $A \ge -B$.
In this case, a numerical quadrature was used to compute the M&E fees.
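
The two parametrisations of the Gompertz law and the Makeham extension can be related with a short sketch. The helper names below are hypothetical; the conversion follows from equating $B\,C^x$ with $\frac{1}{b}\exp((x-m)/b)$, which gives $C = e^{1/b}$ and $B = e^{-m/b}/b$.

```python
import math

def gompertz_bc_from_mb(m, b):
    """Convert (m, b) to (B, C) so that B * C**x == exp((x - m) / b) / b."""
    return math.exp(-m / b) / b, math.exp(1.0 / b)

def gompertz_mb_from_bc(B, C):
    """Inverse conversion: dispersion b = 1 / ln C, mode m = -b ln(B b)."""
    b = 1.0 / math.log(C)
    return -b * math.log(B * b), b

def makeham_hazard(x, A, B, C):
    """Force of mortality A + B C^x; A = 0 gives the Gompertz case."""
    return A + B * C**x

def makeham_survival(t, x, A, B, C):
    """P(T_x > t) = exp(-integral of the hazard between ages x and x + t)."""
    return math.exp(-A * t - B * C**x * (C**t - 1.0) / math.log(C))

B, C = gompertz_bc_from_mb(88.8725, 9.136)   # illustrative (m, b) pair
print(B, C, makeham_survival(25.0, 50.0, 0.0, B, C))
```

Because the Makeham survival function is available in closed form, only the discounted integral of the density requires numerical quadrature.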


3.4 Valuation of the embedded GMDB option 


The valuation of this embedded GMDB option is done in two steps: 

First, taking the conditional expectation given the policyholder’s remaining life- 
time, the option is valued in the context of a jump diffusion process with stochastic 
interest rates, with the assumption that the financial asset in the investor subaccount 
is correlated to the interest rates. 

More precisely, let us recall the embedded GMDB option fair price, as can be
seen in (4):

$$ G(\ell) = E_Q\left[ E_Q\left[\delta_T (S_0 e^{gT} - S_T)^+ \mid T = t\right] \right]. $$

Using the zero-coupon bond of maturity T as a new numéraire, the inner expectation
$I_T$ can be rewritten as:

$$ I_T = E_Q\left[\delta_T (S_0 e^{gT} - S_T)^+\right] = P(0,T)\,E_{Q_T}\left[(K - S_T)^+\right], $$

with $K = S_0 e^{gT}$.

Then this expectation is computed using an adaptation² of the generalised Fourier
transform methodology proposed by Boyarchenko and Levendorskii [3].


4 Empirical study


This section gives a numerical analysis of the effects of jumps, stochastic interest rates
and mortality. The impacts of jumps and interest rates are studied in a first subsection,
while a second subsection examines all these risk factors together.


² A detailed account is available from the authors upon request.


4.1 Impact of jumps and interest rates


The GMDB contract expiry is set at age 75. A guaranty cap of 200 % of the initial 
investment is also among the terms of the contract. 

The Gompertz mortality model is used in this subsection. The Gompertz param- 
eters used in this subsection and the next one are those calibrated to the 1994 Group 
Annuity Mortality Basic table in Milevsky and Posner [8]. They are recalled in Table 1. 


Table 1. Gompertz distribution parameters


                       Female                  Male
Age (years)      m          b           m          b
30               88.8379    9.213       84.4409    9.888
40               88.8599    9.160       84.4729    9.831
50               88.8725    9.136       84.4535    9.922
60               88.8261    9.211       84.2693    10.179
65               88.8403    9.183       84.1811    10.282


A purely diffusive model with a volatility of 20 % serves as a benchmark through- 
out the study. It corresponds to the model used by Milevsky and Posner [8]. 
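
To illustrate how the fee equation G = ME is solved in the purely diffusive benchmark, the following sketch prices the embedded put with the Black-Scholes formula (the fee acting as a dividend yield), integrates it against the Gompertz lifetime density and finds the root with `scipy.optimize.brentq`. It is a simplification under stated assumptions: flat rate r = 6%, guaranteed rate g = 5%, female aged 50, expiry at 75, $S_0 = 1$, and above all no 200% cap on the guarantee, so the output is not expected to reproduce the figures of the tables below. All function names are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.stats import norm

R, G_RATE, SIGMA = 0.06, 0.05, 0.20        # flat rate, guaranteed rate, volatility
X, M, B, EXPIRY = 50, 88.8725, 9.136, 75   # female aged 50, contract expiry at 75
THETA = EXPIRY - X

def survival(t):
    return np.exp(np.exp((X - M) / B) * (1.0 - np.exp(t / B)))

def pdf(t):
    return np.exp((X + t - M) / B) / B * survival(t)

def put_value(fee, t):
    """Black-Scholes put on S (S0 = 1) with strike e^{g t}; no cap (assumption)."""
    t = max(t, 1e-12)                       # guard against a zero maturity
    vol = SIGMA * np.sqrt(t)
    d1 = ((R - fee - G_RATE) * t + 0.5 * vol**2) / vol
    d2 = d1 - vol
    return (np.exp((G_RATE - R) * t) * norm.cdf(-d2)
            - np.exp(-fee * t) * norm.cdf(-d1))

def option_value(fee):
    return quad(lambda t: pdf(t) * put_value(fee, t), 0.0, THETA)[0]

def me_charges(fee):
    integral = quad(lambda t: np.exp(-fee * t) * pdf(t), 0.0, THETA)[0]
    return 1.0 - integral - survival(THETA) * np.exp(-fee * THETA)

fair_fee = brentq(lambda f: option_value(f) - me_charges(f), 1e-6, 0.05)
print(f"fair fee without cap: {1e4 * fair_fee:.2f} bp")
```

Removing the cap can only raise the value of the guarantee, so the fee obtained here should be read as an upper bound on the capped benchmark under the same inputs.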

The particular jump diffusion model used in the following study is the one proposed
by Kou [5]. Another application in life insurance can be seen in Le Courtois
and Quittard-Pinon [6]. In this model, jump sizes J = ln(Y) are i.i.d. and follow a
double exponential law:

$$ f_J(y) = p\,\lambda_1 e^{-\lambda_1 y}\,\mathbf{1}_{y \ge 0} + q\,\lambda_2 e^{\lambda_2 y}\,\mathbf{1}_{y < 0}, \tag{22} $$

with p > 0, q > 0, p + q = 1, $\lambda_1 > 0$ and $\lambda_2 > 0$.

The Kou model parameters are set as follows: p = 0.4, $\lambda_1 = 10$ and
$\lambda_2 = 5$. The jump arrival rate is set to $\lambda = 0.5$. The diffusive part is set so that the
overall quadratic variation is 1.5 times the variation of the no-jump case.
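
The jump compensator and the diffusive volatility implied by these parameters can be worked out as follows. The sketch assumes one particular reading of the quadratic-variation condition, namely $\sigma^2 + \lambda E[J^2] = 1.5 \times 0.20^2$ per unit of time; this reading and the function names are assumptions, not statements from the paper.

```python
import numpy as np

P_UP, LAM1, LAM2, JUMP_RATE = 0.4, 10.0, 5.0, 0.5

def kou_kappa(p, lam1, lam2):
    """kappa = E[Y - 1] = E[e^J] - 1 for double exponential J (needs lam1 > 1)."""
    return p * lam1 / (lam1 - 1.0) + (1.0 - p) * lam2 / (lam2 + 1.0) - 1.0

def kou_second_moment(p, lam1, lam2):
    """E[J^2] under the double exponential density."""
    return 2.0 * p / lam1**2 + 2.0 * (1.0 - p) / lam2**2

def sample_jumps(n, p, lam1, lam2, rng):
    """Draw n jump sizes: upward exponential with prob. p, downward otherwise."""
    up = rng.random(n) < p
    return np.where(up, rng.exponential(1.0 / lam1, n),
                    -rng.exponential(1.0 / lam2, n))

kappa = kou_kappa(P_UP, LAM1, LAM2)
target_variation = 1.5 * 0.20**2
sigma = np.sqrt(target_variation - JUMP_RATE * kou_second_moment(P_UP, LAM1, LAM2))
print(f"kappa = {kappa:.6f}, diffusive sigma = {sigma:.6f}")
```

Under this reading the diffusive volatility falls below the 20% benchmark, the remainder of the variation being carried by the jumps.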

Table 2 shows the percentage of premium versus the annual insurance risk charge
in the no-jump case and the Kou jump diffusion model case for a female policyholder.
A flat interest rate term structure was taken into account in this table and set at r = 6%.

The initial yield curve y(0, t) is supposed to obey the following parametric
equation: $y(0, t) = \alpha - \beta e^{-\gamma t}$, where $\alpha$, $\beta$ and $\gamma$ are positive numbers. The yield is also
supposed to converge towards r for longer maturities. The initial yield curve equation
is set as follows:

$$ y(0, t) = 0.0595 - 0.0195\,\exp(-0.2933\,t). \tag{23} $$
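
A small sketch of the initial curve (23) follows. The conversion to zero-coupon prices assumes continuously compounded yields, which is not stated in the text; the function names are hypothetical.

```python
import math

ALPHA, BETA, GAMMA = 0.0595, 0.0195, 0.2933

def initial_yield(t):
    """Initial yield y(0, t) = alpha - beta * exp(-gamma * t)."""
    return ALPHA - BETA * math.exp(-GAMMA * t)

def zero_coupon_price(t):
    """P(0, t) = exp(-y(0, t) * t), assuming continuous compounding."""
    return math.exp(-initial_yield(t) * t)

for maturity in (0.0, 1.0, 5.0, 10.0, 30.0):
    print(f"t = {maturity:5.1f}  y = {initial_yield(maturity):.4%}  "
          f"P = {zero_coupon_price(maturity):.4f}")
```

The curve starts at $\alpha - \beta = 4\%$ for very short maturities and rises towards $\alpha = 5.95\%$ at the long end.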


Table 2. Jumps impact, female policyholder; r = 6%, g = 5%, 200% cap


Purchase age     No-jump case          Kou model
(years)          (%)       (bp)        (%)       (bp)
30               0.76      1.77        1.16      2.70
40               1.47      4.45        2.04      6.19
50               2.52      10.85       3.21      13.86
60               2.99      21.58       3.55      25.74
65               2.10      22.56       2.47      26.59


Gompertz mortality model. In each case, the left column displays the relative importance of the
M&E charges given by the ratio $ME(\ell)/S_0$. The right column displays the annual insurance
risk charge $\ell$ in basis points (bp).


As stated earlier, the interest rate volatility structure is supposed to be of
exponential form. Technically, it writes as follows:

$$ \sigma_P(s,T) = \frac{\sigma_P}{a}\left(1 - e^{-a(T-s)}\right), \tag{24} $$

where a > 0. In the sequel, we will take $\sigma_P = 0.033333$, a = 1 and the correlation
between the zero-coupon bond and the underlying account will be set at $\rho = 0.35$.
Plugging (24) into (15) allows the computation of $\Sigma_T^2$:

$$ \Sigma_T^2 = \left( \frac{2\rho\sigma\sigma_P}{a^2} - \frac{3\sigma_P^2}{2a^3} \right) + \left( \sigma^2 + \frac{\sigma_P^2}{a^2} - \frac{2\rho\sigma\sigma_P}{a} \right) T + \left( \frac{2\sigma_P^2}{a^3} - \frac{2\rho\sigma\sigma_P}{a^2} \right) e^{-aT} - \frac{\sigma_P^2}{2a^3}\,e^{-2aT}. \tag{25} $$
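
The closed form for $\Sigma_T^2$ can be cross-checked against a direct numerical integration of (15). The sketch assumes the exponential volatility structure $\sigma_P(s,T) = \frac{\sigma_P}{a}(1 - e^{-a(T-s)})$ and uses an illustrative asset volatility of 20%; the function names are hypothetical.

```python
import numpy as np
from scipy.integrate import quad

SIGMA_P, A, RHO = 0.033333, 1.0, 0.35

def bond_vol(s, T, sigma_p=SIGMA_P, a=A):
    """Exponential bond volatility structure sigma_P(s, T)."""
    return sigma_p / a * (1.0 - np.exp(-a * (T - s)))

def total_variance_numeric(T, sigma, sigma_p=SIGMA_P, a=A, rho=RHO):
    """Integrate sigma^2 - 2 rho sigma sigma_P(s,T) + sigma_P(s,T)^2 over [0, T]."""
    def integrand(s):
        v = bond_vol(s, T, sigma_p, a)
        return sigma**2 - 2.0 * rho * sigma * v + v**2
    return quad(integrand, 0.0, T)[0]

def total_variance_closed(T, sigma, sigma_p=SIGMA_P, a=A, rho=RHO):
    """Closed form: constant + linear term + two exponential terms."""
    c0 = 2.0 * rho * sigma * sigma_p / a**2 - 1.5 * sigma_p**2 / a**3
    c1 = sigma**2 + sigma_p**2 / a**2 - 2.0 * rho * sigma * sigma_p / a
    c2 = 2.0 * sigma_p**2 / a**3 - 2.0 * rho * sigma * sigma_p / a**2
    c3 = -sigma_p**2 / (2.0 * a**3)
    return c0 + c1 * T + c2 * np.exp(-a * T) + c3 * np.exp(-2.0 * a * T)

for T in (1.0, 10.0, 25.0):
    print(T, total_variance_numeric(T, 0.20), total_variance_closed(T, 0.20))
```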
The results displayed in Table 3 show that stochastic interest rates have a tremen- 
dous impact on the fair value of the annual insurance risk charge across purchase age. 
Table 3 shows that a 60-year-old male purchaser could be required to pay a risk charge 
as high as 88.65 bp for the death benefit in a stochastic interest rate environment. 
Thus, the stochastic interest rate effect is significantly more pronounced than the 
jump effect. Indeed, the longer the time to maturity, the more jumps tend to smooth 
out, hence the lesser impact. On the other hand, the stochastic nature of interest rates
is felt deeply for the typical time horizon involved in this kind of insurance contract.
It is to be noted that the annual insurance risk charge decreases after age 60. This
decrease after a certain purchase age will be verified again with the figures provided
in the next subsection. Indeed, the approaching contract termination date, set at age 75
as previously, explains this behaviour.


4.2 Impact of combined risk factors 


The impact of mortality models on the fair cost of the GMDB is added in this
subsection. Melnikov and Romaniuk’s [7] Gompertz and Makeham parameters, estimated
from the Human Mortality Database 1959-1999 mortality data, are used in the
sequel. As shown in Table 4, no distinction is made any longer between female and male
policyholders. Instead, the parameters are estimated for the USA as a whole.

In the following figure, the circled curve corresponds to the no-jump model with
a constant interest rate. The crossed curve corresponds to the introduction of Kou
jumps but still with a flat term structure of interest rates. The squared curve adds
jumps and stochastic interest rates to the no-jump case. These three curves are built
with a Gompertz mortality model. The starred curve takes into account jumps and
stochastic interest rates but changes the mortality model to a Makeham one. Figure 1
displays the annual risk insurance charge with respect to the purchase age in the USA.
From 30 years old to around 60 years old, the risk charge is steadily rising across all
models. It decreases sharply afterwards as the contract expiry approaches.


Table 3. Stochastic interest rates impact, male policyholder; g = 5%, 200% cap


Purchase age     Kou model (flat rate)     Kou model (stochastic rates)
(years)          (%)       (bp)            (%)       (bp)
30               2.01      4.86            8.87      22.21
40               3.46      10.99           11.38     37.81
50               5.35      24.46           13.38     64.07
60               5.81      44.82           11.14     88.65
65               4.08      46.31           6.82      78.55


Gompertz mortality model. In each case, the left column displays the relative importance of the
M&E charges given by the ratio $ME(\ell)/S_0$. The right column displays the annual insurance
risk charge $\ell$.


Table 4. Gompertz (G) and Makeham (M) mortality model parameters for the USA [7]


            A                  B                  C
G_US                           6.148 × 10^-5      1.09159
M_US        9.566 × 10^-4      5.162 × 10^-5      1.09369


Table 5. Mortality impact on the annual insurance risk charge (bp), USA; g = 5%, 200% cap


         Gompertz                                      Makeham
Age      No jumps     Kou (flat)     Kou (stoch.)      Kou (stoch.)
30       4.79         6.99           30.23             32.20
40       11.16        15.15          50.86             52.34
50       24.88        31.50          82.50             83.03
60       44.45        52.97          105.27            104.77
65       45.20        53.18          90.41             89.78

The two lower curves correspond strikingly to the flat term structure of the interest 
rate setting. The jump effect is less pronounced than the stochastic interest rate effect 
as represented by the two upper curves. The thin band in which these upper curves 
lie shows that the change of mortality model has also much less impact than the 
stochastic nature of interest rates. 

As is reported in Table 5, and displayed in Figure 1, the behaviour of the insurance
risk charge with respect to age is of the same type whatever the considered model.
However, within this type, differences can be seen. First, the jump effect alone does not
change the fees very much but there are more differences when stochastic interest rates
are introduced. In this case, fees are notably higher. Second, the choice of mortality
model does not have a significant impact.


[Figure omitted. Panel title: Mortality impact, USA]
Fig. 1. Annual risk insurance charge in basis points, USA


5 Conclusions 


To analyse the effects of jump risk, stochastic interest rates and mortality on GMDBs,
this paper assumes a particular jump diffusion process, namely a Kou process, for the
return of the policyholder subaccount and a Vasicek term structure of interest rates,
while the mortality is of a Gompertz or a Makeham type. The contract fair value is
obtained using a methodology based on generalised Fourier analysis. It is shown that
the largest impact among the three risk factors on the GMDB fees is due to stochastic
interest rates. Jumps and mortality have a smaller influence. The fair insurance risk
charges are found to be significantly higher than Milevsky and Posner [8] reported,
but still below the fees required by insurance companies.


Solvency evaluation of the guaranty fund at a large
financial cooperative


Jean Roy 


Abstract. This paper reports on a consulting project whose objective was to evaluate the 
solvency of the guaranty fund of the Mouvement Desjardins, a large federation of financial 
cooperatives based in Quebec, Canada. The guaranty fund acts as an internal mutual insurance 
company; it collects premiums from the 570 local credit unions of the federation and would 
provide funds to any of these local credit unions facing financial difficulties. At the time of the 
study, the assets of the insured credit unions totalled 79 billion CA$ and the fund had a capital of 
523 million CA$. The purpose of the study was to estimate the probability of insolvency of the 
fund over various horizons ranging from one to 15 years. Two very different approaches were 
used to obtain some form of cross-validation. Firstly, under the highly aggregated approach, 
three theoretical statistical distributions were fitted on the 25 historical yearly rates of subsidy. 
Secondly, a highly disaggregated Monte-Carlo simulation model was built to represent the 
financial dynamics of each credit union and the guaranty fund itself, taking into account some 
150 parameters for each credit union. Both approaches converged to similar probabilities of 
insolvency for the fund, which indicated that the fund was well within an implicit AAA rating. 
The study had several significant financial impacts both internally and externally. 


Key words: solvency analysis, financial cooperatives, guaranty fund, Monte Carlo method, 
credit risk 


1 Introduction 


The regulatory context brought by the Basel II accord has given a new impetus to the 
evaluation of the solvency of financial institutions. Although internationally active 
commercial banks have been at the forefront, other financial institutions, such as large 
financial cooperatives, are also strongly involved. The decentralised nature of financial 
cooperatives brings new challenges to the process, as the case study presented here 
will show. Specifically, this paper will report on a consulting project whose objective 
was to evaluate the solvency of the guaranty fund of the Mouvement Desjardins, a 
large federation of credit unions based in Quebec, Canada. The paper will proceed as 
follows. Section 2 will describe the institutional and technical context of the study. 
Section 3 will present the preliminary analysis that was conducted to identify and
eventually select the methods to be applied. The two polar approaches in terms of
aggregation of data were implemented to obtain some form of cross-validation. Section
4 will document the aggregated approach, whereas Section 5 will describe the highly
disaggregated approach implemented through a Monte-Carlo simulation model. Section
6 will compare the results of the two approaches, whereas Section 7 will report
on the several significant impacts of the study for the organisation. Section 8 will
provide the conclusion.


2 Context of the study 


The context of this study had two main dimensions, one internal, namely the institu- 
tional context, and the other external, namely the technical context. Both will now be 
addressed. 

The institution involved in this study is the “Fonds de sécurité Desjardins” or
Desjardins Guaranty Fund (DGF), which is a wholly owned incorporated affiliate of
the “Mouvement Desjardins”. The Mouvement Desjardins is a federation of some 570
local “caisses populaires” or credit unions. DGF acts as an internal mutual insurance
company. It collects annual premiums from the local credit unions and would provide
funds to any of these credit unions in a situation of financial distress. In 2004, DGF
had a capital of 523 million CA$, whereas the insured credit unions had total assets 
of 79 billion CA$, leading to a capitalisation ratio of 66.2 basis points. DGF had, at 
that point, a capitalisation target of 100 basis points. However, management wanted a 
formal evaluation of the solvency of the fund in order to confirm or review this target. 
To perform the analysis, data for the 25 years of formal operation of the fund were 
available. These showed that the fund had played a major role in keeping the local 
credit unions solvent. Indeed, 964 subsidies were granted to 372 different credit unions 
for a total of some 220 million un-indexed CA$ from 1980 to 2004. It also needs to 
be mentioned that the federation played a key role in managing a consolidation of 
the network, bringing the total number of credit unions down from its peak of 1465 
credit unions in 1982 to 570 in 2004. 

A first look at the data showed that the annual average subsidy rate, total subsidies 
to credit unions divided by the total assets of these, was 3.245 basis points, such that 
the current capital was equivalent to a 20.4-year reserve at the average rate. However, 
a cursory analysis of the time series showed a high volatility (4.75 basis points) and 
also high asymmetry and kurtosis. More precisely, two peaks could be identified in 
the series: one at 20.8 basis points in 1982 and another one at 5.8 basis points in 1995. 
Overall, five high values above 5 basis points could be observed during the period of 
25 years. Aside from the historical data of the fund itself, extensive data were also 
available on the insured credit unions over the latest period of seven years, allowing 
contemplation of highly disaggregated models. At that point, a review of the literature 
was conducted to survey the status of current practices for the solvency evaluation of 
similar organisations. 
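
The headline ratios quoted in this section follow from simple arithmetic on the figures given in the text, as the short check below shows (variable names are illustrative).

```python
capital = 523e6               # DGF capital in 2004, CA$
insured_assets = 79e9         # total assets of the insured credit unions, CA$
avg_subsidy_rate_bp = 3.245   # average annual subsidy rate, basis points

capitalisation_bp = capital / insured_assets * 1e4
years_of_reserve = capitalisation_bp / avg_subsidy_rate_bp

print(f"capitalisation ratio: {capitalisation_bp:.1f} bp")
print(f"reserve at the average subsidy rate: {years_of_reserve:.1f} years")
```

The reserve expressed in years is only meaningful at the average rate; the 1982 peak of 20.8 basis points alone would have consumed close to a third of the current capitalisation ratio.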

DGF is an organisation that shares many similarities with public deposit insurers.
Thus the literature on the solvency evaluation of the US and Canadian deposit insurers
was reviewed: the three studies of the Federal Deposit Insurance Corporation by Sheehan [5],
Bennett [1] and Kuritzkes et al. [2] and the study of the Canada Deposit Insurance
Corporation by McIntyre [3]. Overall, the survey of this literature showed that the
dominant approach to estimate the solvency of a deposit insurer was the use of a credit
risk simulation model and it appeared natural to follow this practice. However, before
proceeding, it seemed appropriate to identify the methodological options and make a
deliberate choice.


3 Analysis and selection of methodologies 


The process consisted in identifying the various possible approaches, evaluating them 
and eventually selecting one or several for implementation. After some analysis, four 
dimensions emerged to characterise the possible approaches, namely, the level of 
aggregation of the data, the estimation technique, the depth of historical data and the 
horizon considered for the future. The following sections will look at each in turn. 

The demand for subsidies by the credit unions depends on the losses that these
incur, which depend in turn on the risks they bear. This observation leads one either to
consider the credit unions as a single aggregated entity or to proceed to a disaggregated analysis
of each credit union one by one. Similarly, the total risk can be analysed as an aggregate
of all risks or risks can be analysed individually (e.g., credit risk, market risk and 
operational risk). Finally, credit risk itself can be analysed at the portfolio level or can 
be analysed by segments according to the type, the size and the risk of loans. 

To estimate the distribution of the demand for subsidies by credit unions, two 
techniques appeared possible. If aggregate data were used, it would be possible to fit 
theoretical statistical distributions to the historical distribution. However if disaggre- 
gated data were used, a Monte Carlo simulation model would be more appropriate to 
estimate the distribution of the demand for subsidies. 

If aggregated data were used, 25 years of data would be available. On the other 
hand, if disaggregated data were used, only seven years of data would be available. 

If theoretical statistical distributions were used, the model would be static and the
horizon of one year would logically follow from the yearly period of observation 
of historical data. If a simulation model was used, the financial dynamics of credit 
unions and of the guaranty fund itself could be modelled and trajectories over time 
could be built. In this case, a horizon of 15 years was considered relevant. 

As the analysis above has shown, even though the four dimensions were not 
independent, several combinations of choices could be implemented and these could 
be viewed as forming a spectrum mainly according to the level of aggregation of data, 
which seemed to be the dimension that had the strongest impact on conditioning the 
other choices. In this light, it was decided to move forward with the implementation of 
two polar choices, namely a highly aggregated approach and a highly disaggregated 
approach. Table 1 summarises the characteristics of each of these two approaches. 

It was deemed interesting to implement two very different approaches and to 
observe whether they would converge or not to similar results. If similarity was obtained, 
then a cross-validation effect would increase confidence in the results. If dissimilarity 
was obtained, then an analysis would have to be conducted to understand the sources 
of differences and eventually decide on which approach seems more reliable. The 
next two sections will describe the implementation and the results obtained under the 
two approaches. 


Table 1. Characterisation of the two approaches selected for implementation 

                           Aggregated                  Disaggregated 
Aggregation of data        High                        Low 
Estimation technique       Theoretical distributions   Monte Carlo simulation 
Depth of historical data   25 years                    7 years 
Horizon for projection     1 year (static)             15 years (dynamic) 


4 The aggregated approach 


Under this approach, the aggregate demand for subsidies by the credit unions will be 
estimated using 25 historical observations of the aggregate rate of subsidy defined as 
the sum of the subsidies for the year divided by the total assets of the credit unions 
at the beginning of the year. Three theoretical distributions were selected, namely 
the Weibull, the Gamma and the Lognormal. Each of these distributions was fitted to 
the historical cumulative distribution of the rate of subsidy. These three distributions 
have some common features: they are characterised by two parameters and they 
accommodate asymmetry. 

Before providing the results of the estimation process, it is appropriate to mention 
that this approach implies two strong hypotheses. First, one must assume that the de- 
mand for subsidy has had and will continue to have a distribution that is stable in time. 
This is indeed a strong assumption as both internal and external structural conditions 
have evolved significantly. Internally, a strong consolidation of credit unions has taken 
place which resulted in more than halving their total number giving rise to bigger and 
hopefully stronger units. Externally, the monetary policy of the Bank of Canada has 
changed over the years and the strong emphasis now put on the control of inflation 
will avoid the high nominal interest rates that took place in the early 1980s and which 
generated massive credit problems. Second, this approach also assumes implicitly 
that there is no serial correlation, which is most likely contrary to reality as there 
were clearly periods of good times and bad times that extended over several years. 
Overall, the first assumption points to overestimating the current demand, whereas 
the second may lead to underestimating demand in extended periods of difficulties. 
One may hope that the net effect of the two biases is small. Finally, the depth of the 
historical data allows the inclusion of two periods of difficulties, which may represent 
other unknown difficulties that may arise in the future. 

With these considerations in mind, we estimated the parameters of the three 
distributions using unbiased OLS; the results are shown in Table 2 together with various 
statistics. Overall, the statistics seem to show a reasonably good fit of the distributions 
to the historical data. With these distributions on hand, one now needs to evaluate the 
probability that the demand for subsidies is lower than the current level of capital. 
It must be remembered that historical values have occurred in the interval ranging 
from 0 to 20.8 basis points (bp). On the other hand, the current level of capital was 66.2 
bp. Thus one must extrapolate the distributions to a point that is more than three 
times bigger than the biggest value ever observed. With the awareness of this fact, we 
proceeded to evaluate the three estimated distributions at the point corresponding to 
the current level of capital. Table 3 presents the results that were obtained together 
with the implied rating according to the scales of S&P and Moody’s. 


Table 2. Three estimated distributions of the aggregate demand for subsidies 

Distribution                                     Weibull        Gamma          Log-normal 
Parameter 1                                      α = 0.742      α = 0.527      μ = −8.95 
Parameter 2                                      β = 0.000225   β = 0.000616   σ = 1.36 
Mean value                                       0.03245        0.03245        0.03245 
R²                                               99.14 %        98.37 %        99.06 % 
Kolmogorov-Smirnov test                          0.943          0.964          0.964 
Chi-squared test                                 0.139          0.505          0.405 
Chi-squared test w/o the greatest contribution   0.964          0.948          0.954 


Table 3. Solvency estimates of the guaranty fund using statistical distributions 

Distribution   Probability of solvency   Probability of default (in bp)   S&P rating   Moody’s rating 
Weibull        99.9995 %                 0.05                             AAA          Aaa 
Gamma          99.9996 %                 0.04                             AAA          Aaa 
Log-normal     99.8142 %                 18.58                            BBB          Baa2 


As can be observed, the Weibull and the Gamma distributions give very similar 
results, whereas the log-normal distribution points to a higher probability of default 
and accordingly a lower credit risk rating. Under the first two distributions, the 
guaranty fund achieves the implied triple A rating very easily because a probability of 
default of less than 1 basis point is enough to obtain this rating. 
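
The extrapolation behind Table 3 can be retraced with a short Python sketch. It is an illustration only, and it assumes a parameterisation that the text does not state explicitly: a Weibull with shape α and scale β, a Gamma with shape α and scale β, and a log-normal with log-mean μ and log-standard deviation σ, all applied to the rate of subsidy expressed as a fraction of assets. Under these assumptions the rounded parameters of Table 2 give default probabilities close to those of Table 3.

```python
import math

CAPITAL = 66.2e-4  # current capital, 66.2 bp, expressed as a fraction of assets


def weibull_sf(x, shape, scale):
    """P(X > x) for a Weibull distribution (assumed shape/scale form)."""
    return math.exp(-((x / scale) ** shape))


def gamma_sf(x, shape, scale, terms=200):
    """P(X > x) for a Gamma distribution (assumed shape/scale form).

    Uses the power series of the regularised lower incomplete gamma
    function: P(a, z) = z^a e^(-z) * sum_n z^n / Gamma(a + n + 1).
    """
    z = x / scale
    term = 1.0 / math.gamma(shape + 1.0)
    total = term
    for n in range(1, terms):
        term *= z / (shape + n)
        total += term
    return 1.0 - math.exp(-z) * z**shape * total


def lognormal_sf(x, mu, sigma):
    """P(X > x) for a log-normal distribution."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Probability of default in basis points, using the rounded Table 2 parameters.
default_bp = {
    "Weibull": 1e4 * weibull_sf(CAPITAL, 0.742, 0.000225),
    "Gamma": 1e4 * gamma_sf(CAPITAL, 0.527, 0.000616),
    "Log-normal": 1e4 * lognormal_sf(CAPITAL, -8.95, 1.36),
}

for name, bp in default_bp.items():
    print(f"{name:<11s} probability of default = {bp:.2f} bp")
```

Because the published parameters are rounded, the log-normal figure obtained this way differs slightly from the 18.58 bp reported in Table 3; the order of magnitude and the ranking of the three distributions are unchanged.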

Thus, the aggregated approach has allowed the estimation of the solvency of the 
fund. Accepting their admittedly strong assumptions, two out of three theoretical 
distributions lead to a very strong evaluation of the solvency of the fund, the third 
distribution showing a somewhat weaker position of the fund. 
5 The disaggregated approach 


Under the disaggregated approach, the income statement of each credit union is simu- 
lated using six stochastic variables: namely, the net interest income, loan losses, other 
income, other expenses, operating expenses and operational losses. Overall, these six 
variables are meant to capture interest rate risk, market risk, credit risk and opera- 
tional risk. Once the income statements are simulated, the balance sheets of the credit 
unions are computed. Should the capital be under the regulatory requirement, then a 
demand for subsidy at the guaranty fund will be generated. After the total demand for 
subsidies is computed, the financial statements of the guaranty fund are simulated. 
Trajectories simulating 15 consecutive years are run using 50000 trials in order to 
obtain the distribution of the level of capital of the fund and thus estimate its solvency 
over the 15-year horizon. 

Let us now examine in more detail how the six stochastic variables were modelled. 
Special attention was devoted to credit risk, as it is believed to be the most important 
source of risk. Thus, the loan portfolio of each credit union was in turn divided into 
ten types of loans, namely: consumer loans, mortgages, investment loans, commercial 
loans, agricultural loans, institutional loans, lines of credit to individuals, commercial 
lines of credit, agricultural lines of credit and institutional lines of credit. In turn, each 
of these ten types of loans was divided into three size categories, namely: below 
$100 000, between $100 000 and $1 000 000, and above $1 000 000. Now, for each of 
these 30 credit segments, five historical parameters were available: the probability of 
default (PD), the exposure at default (EAD), the loss given default (LGD), the number 
of loans in the segment (N) and the correlation factor with a global latent economic 
factor (ρ). A Merton-type model is then used. First, the value of the latent economic 
factor is drawn and then the PD of each segment is conditioned on its value. Secondly, 
the number of defaults (ND) in a segment is generated with a binomial distribution 
using the number of loans in the segment N and the conditional PD. Finally, the loan 
losses are obtained as the product of the number of defaults ND, the exposure at default 
EAD and the loss given default LGD. The five other stochastic variables are simulated 
using normal distributions with given expected values and standard deviations. Two sets 
of assumptions are used: a base case using historical values and a stressed case where 
the standard deviations were increased by 50% to represent higher risks. Finally, 
a correlation structure was modelled. Serial correlation factors were assumed for 
the latent economic factor (0.5) to represent business/credit cycles, for the operating 
expenses (0.5) and for net interest revenue (0.5) to represent the inertia of these. A 
negative cross correlation factor (−0.55) was also introduced between net interest 
revenues and operational losses. 
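
The three-step credit loss mechanism described above can be sketched in Python as follows. This is a minimal illustration, not the model actually used in the study: the text only says that a Merton-type model conditions the PD on the latent factor, so the one-factor (Vasicek-style) conditioning formula below, the interpretation of ρ as an asset correlation, and all numerical values in the usage lines are assumptions.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def conditional_pd(pd, rho, z):
    """PD of a segment given the latent economic factor z.

    Assumed one-factor form: a low (bad) value of z raises the PD.
    """
    threshold = STD_NORMAL.inv_cdf(pd)
    return STD_NORMAL.cdf((threshold - math.sqrt(rho) * z) / math.sqrt(1.0 - rho))


def simulate_segment_loss(pd, ead, lgd, n_loans, rho, z, rng):
    """Loan losses of one segment: ND * EAD * LGD, with ND binomial."""
    p = conditional_pd(pd, rho, z)
    # Binomial draw written as a sum of Bernoulli trials (fine for small N).
    n_defaults = sum(rng.random() < p for _ in range(n_loans))
    return n_defaults * ead * lgd


# Hypothetical segment: 500 loans, PD 2%, EAD 50 000, LGD 40%, rho 0.15.
rng = random.Random(1)
z = rng.gauss(0.0, 1.0)  # step 1: draw the latent economic factor
loss = simulate_segment_loss(0.02, 50_000.0, 0.40, 500, 0.15, z, rng)
print(f"latent factor = {z:.3f}, segment loss = {loss:,.0f}")
```

In a full simulation the same draw of z would be shared by all 30 segments of all credit unions in a given year, which is what creates the dependence between their losses.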

Following the simulation of the financial statements of the credit unions, those 
of the guaranty fund are generated. The fund has two types of revenues: revenues 
obtained from investing its assets and premiums collected from the credit unions. It 
has three main types of expenses: fixed administrative expenses, payment of subsidies 
to the credit unions in need and taxes on its profit. Stochastic values were generated 
for investment income, which were correlated to the latent economic factor, and for 
the subsidies paid through the simulation described above. Finally, 
two policies for premiums were simulated: the current policy, which is expressed as 
1/14 of 1% of the risky assets of the insured credit union, and a hypothetical relaxed 
policy of 1/17 of 1% of the risky assets. The latter policy was simulated because it 
was anticipated that, given the excellent capital of the fund, it could maintain a good 
solvency level while easing its burden on its insured members. 

In total four combinations of scenarios were simulated according to whether the 
parameters had base-case or stressed values and to whether the policy for premiums 
was modelled as base-case or relaxed. Table 4 below shows the results for each of 
the four cases. The probabilities estimated over the 15-year horizon were converted 
to a one-year horizon as it is this latter horizon that is used for reference by rating 
agencies such as S&P and Moody’s. 

It is striking that under the two base case scenarios, the level of insolvency is 
much lower than one basis point, thus allowing an implied credit rating of AAA to 
be granted to the fund. Under the two stressed cases, the level of solvency is close to 
the threshold needed to get a triple A rating. Overall, the simulation model leads to 
the belief that the solvency of the fund is indeed excellent. 


Table 4. Solvency estimates of the guaranty fund by Monte Carlo simulation 

Parameters                   Base case                 Stressed case 
Policy for premiums          1/14 %       1/17 %       1/14 %       1/17 % 
Nb of cases of insolvency    6            10           74           101 
Nb of cases of solvency      49994        49990        49926        49899 
Total number of cases        50000        50000        50000        50000 
Solvency over 15 years       99.9880 %    99.9800 %    99.8520 %    99.7980 % 
Solvency over 1 year         99.9992 %    99.9987 %    99.9901 %    99.9865 % 
Insolvency over 15 years     0.0120 %     0.0200 %     0.1480 %     0.2020 % 
Insolvency over 1 year       0.0008 %     0.0013 %     0.0099 %     0.0135 % 
Implied rating               AAA          AAA          AAA          AAA 
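
The text does not spell out how the 15-year probabilities were converted to a one-year horizon. A plausible reading, offered here as an assumption, is a constant annual survival probability, so that the one-year solvency is the 15th root of the 15-year solvency. The short Python sketch below applies this rule to the insolvency counts of Table 4 and recovers the one-year figures reported there.

```python
def one_year_solvency(insolvent_paths, total_paths, horizon_years=15):
    """Annualised solvency probability, assuming a constant annual rate."""
    solvency_over_horizon = 1.0 - insolvent_paths / total_paths
    return solvency_over_horizon ** (1.0 / horizon_years)


# Number of insolvent trajectories out of 50 000, as reported in Table 4.
scenarios = {
    "Base case, 1/14 %": 6,
    "Base case, 1/17 %": 10,
    "Stressed case, 1/14 %": 74,
    "Stressed case, 1/17 %": 101,
}

for name, insolvent in scenarios.items():
    p1 = one_year_solvency(insolvent, 50_000)
    print(f"{name:<22s} solvency over 1 year = {100 * p1:.4f} %, "
          f"default = {1e4 * (1 - p1):.2f} bp")
```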


6 Comparison of the two approaches 


It is now interesting to compare the results obtained under the aggregated and dis- 
aggregated approaches. Table 5 summarises these results. Overall, Table 5 provides 
a sensitivity analysis of the solvency estimates while varying methods and hypothe- 
ses about distributions, parameters and premium policies. Obviously, there is a wide 
margin between the best and the worst estimates. However, apart from the log-normal 
distribution, all other results are basically in the same range. One clear advantage of 
the simulation model is that it allows the analysis of hypothetical cases, as was done 
in the last three scenarios considered. So, to make a fair comparison between the 
statistical distribution approach and the simulation approach one must use the first 
base case scenario. Then one observes that the probability of insolvency obtained 
(0.08 bp) is quite consistent with the values obtained using the Weibull and Gamma 
distributions (0.05 bp and 0.04 bp). This observation is quite comforting as we 
interpret it as each result being reinforced by the other. It is striking indeed that the two 
very different approaches did in fact converge to basically similar results. Overall, it 
can be concluded that the solvency of the guaranty fund was excellent and that this 
conclusion could be taken with a high level of confidence considering the duplication 
obtained. 


Table 5. Summary of solvency estimates under the two approaches 

Approach                                 Probability of solvency   Probability of default (bp)   Implied S&P rating 
Statistical distribution 
  Weibull                                99.9995 %                 0.05                          AAA 
  Gamma                                  99.9996 %                 0.04                          AAA 
  Log-normal                             99.8142 %                 18.58                         BBB 
Monte Carlo simulation 
  Base case and premiums at 1/14 %       99.9992 %                 0.08                          AAA 
  Base case and premiums at 1/17 %       99.9987 %                 0.13                          AAA 
  Stressed case and premiums at 1/14 %   99.9901 %                 0.99                          AAA 
  Stressed case and premiums at 1/17 %   99.9865 %                 1.35                          AAA 


7 Financial impact of the study 


The study led the management of the fund to take several significant actions. First, 
the target capital ratio, which was previously set to 1% of the aggregate assets of the 
credit unions, was brought down to an interval between 0.55% and 0.65%, basically 
corresponding to the current level of capital of 0.62%. Secondly, management decided 
to lower the premiums charged to the credit unions. Finally, the deposits made at any of 
the credit unions of the Mouvement Desjardins are also guaranteed by a public deposit 
insurer managed by the Financial Market Authority of Quebec (FMA) to which the 
Mouvement Desjardins has to pay an annual premium. The solvency study that we 
have described above was presented to the FMA to request a reduction of the premium 
and after careful examination the FMA granted a very significant reduction. Thus, 
one could argue that the study achieved its goals. First, it provided management of the 
guaranty fund with the information it requested, that is a well grounded estimation of 
the solvency of the fund. Secondly, the very favourable assessment of the solvency 
allowed management to take several actions to reap the benefits of the excellent state 
of solvency of the fund. 
8 Conclusions 


This study faced the challenge of estimating the solvency of a guaranty fund of 
523 million CA$ insuring a network of 570 credit unions totalling some 79 billion 
CA$ in assets invested in a broad spectrum of personal, commercial, agricultural and 
institutional loans. As a preliminary step, the array of possible approaches according 
to the aggregation of data, the estimation technique, the depth of historical data and 
the projection horizon was examined. After analysis, two polar approaches in terms 
of the aggregation of data were selected for implementation. Under the aggregated 
approach, three statistical distributions were fitted to twenty-five yearly observations 
of the total rate of subsidy. Under the disaggregated approach, an elaborate Monte 
Carlo simulation model was set up whereby the financial statements of each credit 
union and those of the guaranty fund itself were generated, integrating four types of 
risks and using more than 7500 risk parameters, mainly representing credit risk at 
a very segmented level. The Monte Carlo simulation was also used to evaluate the 
impact of stressed values of the risk parameters and of a relaxation of the policy for 
the premiums charged by the fund. Overall, both approaches converged to similar 
estimates of the solvency of the fund, thus reinforcing the level of confidence in the 
results. Accordingly, the solvency of the fund could be considered as excellent, being 
well within an implied triple A rating under the base case scenario, and still qualifying, 
although marginally, for this rating under stressed hypotheses. The detailed analysis 
of the solvency of the fund and the good evaluation it brought had three significant 
financial impacts: the target capital ratio of the fund was revised downward, the 
premiums charged to credit unions were reduced and the Mouvement Desjardins 
itself obtained a sizable reduction of the premium it pays to the public deposit insurer. 
Needless to say, management of the guaranty fund was quite satisfied with these 
outcomes. Finally, it was decided to update the study every five years. From this 
perspective, several improvements and extensions, namely regarding a more refined 
modelling of the risk factors other than credit risk and a more adaptive premium 
policy, are already envisaged. 
A Monte Carlo approach to value exchange options 
using a single stochastic factor 


Giovanni Villani 


Abstract. This article describes a sampling modification of the Monte Carlo method 
designed to minimise the variance of simulations. In particular, we propose 
a generalisation of the antithetic method and a new a-sampling stratified procedure with 
a ≠ 1/2 to value exchange options using a single stochastic factor. As is well known, exchange 
options give the holder the right to exchange one risky asset V for another risky asset D and 
therefore, when an exchange option is valued, we generally are exposed to two sources of 
uncertainty. The reduction of the bi-dimensionality of the valuation problem to a single stochastic 
factor implies a new stratification procedure to improve the Monte Carlo method. We also 
provide a set of numerical experiments to verify the accuracy derived by a-sampling. 


Key words: exchange options, Monte Carlo simulations, variance reduction 


1 Introduction 


Simulations are widely used to solve option pricing. With the arrival of ever faster 
computers coupled with the development of new numerical methods, we are able to 
numerically solve an increasing number of important security pricing models. Even 
where we appear to have analytical solutions it is often desirable to have an alternative 
implementation that is supposed to give the same answer. Simulation methods for 
asset pricing were introduced in finance by Boyle [3]. Since that time simulation has 
been successfully applied to a wide range of pricing problems, particularly to value 
American options, as witnessed by the contributions of Tilley [10], Barraquand and 
Martineau [2], Broadie and Glasserman [4], Raymar and Zwecher [9]. 

The aim of this paper is to improve the Monte Carlo procedure in order to evaluate 
exchange options by generalising the antithetic variate methodology and 
proposing a new stratification procedure. To realise this objective, we price the most 
important exchange options using a single stochastic factor P that is the ratio between 
the underlying asset V and the delivery one D. For this reason, we need a particular 
sampling to concentrate the simulations in the range in which the payoff function of P 
is more sensitive. 
The most relevant models that value exchange options are given in Margrabe [7], 
McDonald and Siegel [8], Carr [5,6] and Armada et al. [1]. Margrabe [7] values a 
European exchange option that gives the right to realise such an exchange only at 
expiration. McDonald and Siegel [8] value a European exchange option considering 
that the assets distribute dividends and Carr [5] values a compound European exchange 
option in which the underlying asset is another exchange option. However, when the 
assets pay sufficiently large dividends, there is a positive probability that an American 
exchange option will be exercised strictly prior to expiration. This positive probability 
induces additional value for an American exchange option as given in Carr [5,6] and 
Armada et al. [1]. 

The paper is organised as follows. Section 2 presents the estimation of a Simple 
European Exchange option, Section 3 introduces the Monte Carlo valuation of a 
Compound European Exchange option and Section 4 gives us the estimation of a 
Pseudo American Exchange option. In Section 5, we apply new techniques that allow 
reduction of the variance concerning the above option pricing and we also present 
some numerical studies. Finally, Section 6 concludes. 


2 The price of a Simple European Exchange Option (SEEO) 


We begin our discussion by focusing on a SEEO to exchange asset D for asset V at 
time T. Denoting by $s(V, D, T - t)$ the value of the SEEO at time t, the final payoff at 
the option’s maturity date T is $s(V, D, 0) = \max(0, V_T - D_T)$, where $V_T$ and $D_T$ 
are the underlying assets’ terminal prices. So, assuming that the dynamics of assets 
V and D are given by: 

$$\frac{dV}{V} = (\mu_v - \delta_v)\,dt + \sigma_v\,dZ_v, \tag{1}$$

$$\frac{dD}{D} = (\mu_d - \delta_d)\,dt + \sigma_d\,dZ_d, \tag{2}$$

$$\mathrm{Cov}\left(\frac{dV}{V}, \frac{dD}{D}\right) = \rho_{vd}\,\sigma_v\,\sigma_d\,dt, \tag{3}$$

where $\mu_v$ and $\mu_d$ are the expected rates of return on the two assets, $\delta_v$ and $\delta_d$ are the 
corresponding dividend yields, $\sigma_v^2$ and $\sigma_d^2$ are the respective variance rates and $Z_v$ and 
$Z_d$ are two standard Brownian motions with correlation coefficient $\rho_{vd}$, Margrabe [7] 
and McDonald and Siegel [8] show that the value of a SEEO on dividend-paying 
assets, when the valuation date is t = 0, is given by: 

$$s(V, D, T) = V e^{-\delta_v T} N(d_1(P, T)) - D e^{-\delta_d T} N(d_2(P, T)), \tag{4}$$

where: 

• $P = \dfrac{V}{D}$; $\sigma = \sqrt{\sigma_v^2 - 2\rho_{vd}\,\sigma_v\,\sigma_d + \sigma_d^2}$; $\delta = \delta_v - \delta_d$; 

• $d_1(P, T) = \dfrac{\log P + \left(\frac{\sigma^2}{2} - \delta\right) T}{\sigma\sqrt{T}}$; $d_2(P, T) = d_1(P, T) - \sigma\sqrt{T}$; 

• $N(d)$ is the cumulative standard normal distribution. 
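
As a quick check of Equation (4), the following Python sketch evaluates the closed-form SEEO price. The function name and argument names are illustrative choices; the parameter values in the usage lines are those used in the numerical section of the paper (σ_v = 0.40, σ_d = 0.30, ρ_vd = 0.20, δ_v = 0.15, δ_d = 0, T = 2), for which the paper reports a true value of 19.8354 when V = D = 180.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # cumulative standard normal distribution


def seeo_price(V, D, T, sigma_v, sigma_d, rho_vd, delta_v, delta_d):
    """Closed-form value of a Simple European Exchange Option, Equation (4)."""
    sigma = math.sqrt(sigma_v**2 - 2.0 * rho_vd * sigma_v * sigma_d + sigma_d**2)
    delta = delta_v - delta_d
    P = V / D
    d1 = (math.log(P) + (0.5 * sigma**2 - delta) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return V * math.exp(-delta_v * T) * N(d1) - D * math.exp(-delta_d * T) * N(d2)


price = seeo_price(180.0, 180.0, 2.0, 0.40, 0.30, 0.20, 0.15, 0.0)
print(f"SEEO price for V = D = 180: {price:.4f}")
```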


The typical simulation approach is to price the SEEO as the expected value of 
discounted cash-flows under the risk-neutral probability Q. So, for the risk-neutral 
version of Equations (1) and (2), it is enough to replace the expected rates of return $\mu_i$ 
by the risk-free interest rate r plus the risk premium, namely $\mu_i = r + \lambda_i \sigma_i$, where 
$\lambda_i$ is the asset’s market price of risk, for $i = v, d$. So, we obtain the risk-neutral 
stochastic equations: 

$$\frac{dV}{V} = (r - \delta_v)\,dt + \sigma_v\,(dZ_v + \lambda_v\,dt) = (r - \delta_v)\,dt + \sigma_v\,dZ_v^*, \tag{5}$$

$$\frac{dD}{D} = (r - \delta_d)\,dt + \sigma_d\,(dZ_d + \lambda_d\,dt) = (r - \delta_d)\,dt + \sigma_d\,dZ_d^*. \tag{6}$$

The Brownian processes $dZ_v^* = dZ_v + \lambda_v\,dt$ and $dZ_d^* = dZ_d + \lambda_d\,dt$ are the new 
Brownian motions under the risk-neutral probability Q and $\mathrm{Cov}(dZ_v^*, dZ_d^*) = \rho_{vd}\,dt$. 
Applying Ito’s lemma, we can reach the equation for the ratio-price simulation $P = V/D$ 
under the risk-neutral measure Q: 

$$\frac{dP}{P} = (-\delta + \sigma_d^2 - \sigma_v\,\sigma_d\,\rho_{vd})\,dt + \sigma_v\,dZ_v^* - \sigma_d\,dZ_d^*. \tag{7}$$


Applying the log-transformation for $D_T$, under the probability Q, it results in: 

$$D_T = D_0 \exp\{(r - \delta_d) T\} \cdot \exp\left(-\frac{\sigma_d^2}{2} T + \sigma_d Z_d^*(T)\right). \tag{8}$$

We have that $U \equiv \left(-\frac{\sigma_d^2}{2} T + \sigma_d Z_d^*(T)\right) \sim N\left(-\frac{\sigma_d^2}{2} T,\ \sigma_d\sqrt{T}\right)$ and 
therefore $\exp(U)$ is a log-normal whose expected value is 
$E_Q[\exp(U)] = \exp\left(-\frac{\sigma_d^2}{2} T + \frac{\sigma_d^2}{2} T\right) = 1$. So, by Girsanov’s theorem, we can define the new 
probability measure $\tilde{Q}$ equivalent to Q and the Radon-Nikodym derivative is: 

$$\frac{d\tilde{Q}}{dQ} = \exp\left(-\frac{\sigma_d^2}{2} T + \sigma_d Z_d^*(T)\right). \tag{9}$$

Hence, using Equation (8), we can write: 

$$D_T = D_0\, e^{(r - \delta_d) T} \cdot \frac{d\tilde{Q}}{dQ}. \tag{10}$$

By the Girsanov theorem, the processes: 

$$d\hat{Z}_d = dZ_d^* - \sigma_d\,dt, \tag{11}$$

$$d\hat{Z}_v = \rho_{vd}\,d\hat{Z}_d + \sqrt{1 - \rho_{vd}^2}\,dZ', \tag{12}$$

are two Brownian motions under the risk-neutral probability measure $\tilde{Q}$ and $Z'$ is 
a Brownian motion under $\tilde{Q}$ independent of $\hat{Z}_d$. By the Brownian motions defined 
in Equations (11) and (12), we can rewrite Equation (7) for the asset P under the 
risk-neutral probability $\tilde{Q}$. So it results that: 

$$\frac{dP}{P} = -\delta\,dt + \sigma_v\,d\hat{Z}_v - \sigma_d\,d\hat{Z}_d. \tag{13}$$


Using Equation (12), it results that: 

$$\sigma_v\,d\hat{Z}_v - \sigma_d\,d\hat{Z}_d = (\sigma_v\,\rho_{vd} - \sigma_d)\,d\hat{Z}_d + \sigma_v \sqrt{1 - \rho_{vd}^2}\,dZ', \tag{14}$$

where $\hat{Z}_d$ and $Z'$ are independent under $\tilde{Q}$. Therefore, as $(\sigma_v\,d\hat{Z}_v - \sigma_d\,d\hat{Z}_d) \sim 
N(0, \sigma\sqrt{dt})$, we can rewrite Equation (13): 

$$\frac{dP}{P} = -\delta\,dt + \sigma\,dZ_p, \tag{15}$$

where $\sigma = \sqrt{\sigma_v^2 + \sigma_d^2 - 2\sigma_v\,\sigma_d\,\rho_{vd}}$ and $Z_p$ is a Brownian motion under $\tilde{Q}$. Using the 
log-transformation, we obtain the equation for the risk-neutral price simulation P: 

$$P_t = P_0 \exp\left\{\left(-\delta - \frac{\sigma^2}{2}\right) t + \sigma Z_p(t)\right\}. \tag{16}$$

So, using the asset $D_T$ as numeraire given by Equation (10), we price a SEEO as the 
expected value of discounted cash-flows under the risk-neutral probability measure: 

$$s(V, D, T) = e^{-rT} E_Q[\max(0, V_T - D_T)] = D_0\, e^{-\delta_d T} E_{\tilde{Q}}[g_s(P_T)], \tag{17}$$

where $g_s(P_T) = \max(P_T - 1, 0)$. Finally, it is possible to implement the Monte Carlo 
simulation to approximate: 

$$E_{\tilde{Q}}[g_s(P_T)] \approx \frac{1}{n} \sum_{i=1}^{n} g_s^i(\hat{P}_T^i), \tag{18}$$

where n is the number of simulated paths, $\hat{P}_T^i$ for $i = 1, 2, \ldots, n$ are the simulated 
values and $g_s^i(\hat{P}_T^i) = \max(0, \hat{P}_T^i - 1)$ are the n simulated payoffs of SEEO using a 
single stochastic factor. 
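
Equations (16) to (18) translate into a very short simulation loop. The Python sketch below is an illustration under stated assumptions: it draws the terminal ratio with standard normal variates from Python's `random` module, uses a smaller number of paths than the 500 000 of the paper, and reports the standard error alongside the price. Function and variable names are illustrative.

```python
import math
import random


def seeo_monte_carlo(V0, D0, T, sigma_v, sigma_d, rho_vd, delta_v, delta_d,
                     n, seed=0):
    """Crude Monte Carlo price of a SEEO using the single factor P = V/D.

    Returns (price, standard_error).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(sigma_v**2 + sigma_d**2 - 2.0 * sigma_v * sigma_d * rho_vd)
    delta = delta_v - delta_d
    P0 = V0 / D0
    drift = (-delta - 0.5 * sigma**2) * T   # mean of ln(P_T / P_0), Eq. (16)
    vol = sigma * math.sqrt(T)              # std deviation of ln(P_T / P_0)

    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        P_T = P0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        g = max(P_T - 1.0, 0.0)             # payoff g_s(P_T) of Eq. (17)
        total += g
        total_sq += g * g

    mean = total / n
    variance = total_sq / n - mean**2
    scale = D0 * math.exp(-delta_d * T)     # numeraire factor of Eq. (17)
    return scale * mean, scale * math.sqrt(variance / n)


price, std_err = seeo_monte_carlo(180.0, 180.0, 2.0, 0.40, 0.30, 0.20,
                                  0.15, 0.0, n=200_000, seed=1)
print(f"SEEO (sim) = {price:.4f} +/- {std_err:.4f}")
```

With the parameters of the numerical section the estimate should fall within a few standard errors of the closed-form value 19.8354 given by Equation (4).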


3 The price of a Compound European Exchange Option (CEEO) 


The CEEO is a derivative in which the underlying asset is another exchange option. 
Carr [5] develops a model to value the CEEO assuming that the underlying asset is 
a SEEO $s(V, D, \tau)$ whose time to maturity is $\tau = T - t_1$, with $t_1 < T$, the exercise 
price is a ratio q of asset D at time $t_1$ and the expiration date is $t_1$. So, considering 
that the valuation date is t = 0, the final payoff of CEEO at maturity date $t_1$ is: 

$$c(s(V, D, \tau), qD, 0) = \max[0, s(V, D, \tau) - qD].$$

Assuming that the evolutions of assets V and D are given by Equations (1) and (2), 
under certain assumptions, Carr [5] shows that the CEEO price at evaluation date 
t = 0 is: 

$$c(s(V, D, \tau), qD, t_1) = V e^{-\delta_v T} N_2\left(d_1\left(\frac{P}{P_1^*}, t_1\right), d_1(P, T); \rho\right) 
 - D e^{-\delta_d T} N_2\left(d_2\left(\frac{P}{P_1^*}, t_1\right), d_2(P, T); \rho\right) 
 - q D e^{-\delta_d t_1} N\left(d_2\left(\frac{P}{P_1^*}, t_1\right)\right), \tag{19}$$

where $N_2(x_1, x_2; \rho)$ is the standard bivariate normal distribution evaluated at $x_1$ and 
$x_2$ with correlation $\rho = \sqrt{t_1/T}$ and $P_1^*$ is the critical price ratio that makes the underlying 
asset and the exercise price equal and solves the following equation: 

$$P_1^* e^{-\delta_v \tau} N(d_1(P_1^*, \tau)) - e^{-\delta_d \tau} N(d_2(P_1^*, \tau)) = q. \tag{20}$$

It is obvious that the CEEO will be exercised at time $t_1$ if $P_{t_1} \geq P_1^*$. We price the CEEO 
as the expected value of discounted cash-flows under the risk-neutral probability 
Q and, after some manipulations and using $D_{t_1}$ as numeraire, we obtain: 

$$c(s, qD, t_1) = e^{-r t_1} E_Q[\max(s(V_{t_1}, D_{t_1}, \tau) - q D_{t_1}, 0)] = D_0\, e^{-\delta_d t_1} E_{\tilde{Q}}[g_c(P_{t_1})], \tag{21}$$

where 

$$g_c(P_{t_1}) = \max[P_{t_1} e^{-\delta_v \tau} N(d_1(P_{t_1}, \tau)) - e^{-\delta_d \tau} N(d_2(P_{t_1}, \tau)) - q,\ 0]. \tag{22}$$

Using a Monte Carlo simulation, it is possible to approximate the value of CEEO as: 

$$c(s, qD, t_1) \approx D_0\, e^{-\delta_d t_1} \left(\frac{\sum_{i=1}^{n} g_c^i(\hat{P}_{t_1}^i)}{n}\right), \tag{23}$$

where n is the number of simulated paths and $g_c^i(\hat{P}_{t_1}^i)$ are the n simulated payoffs of 
CEEO using a single stochastic factor. 
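
The estimator of Equation (23) only requires simulating the ratio P at time t_1 and evaluating the payoff of Equation (22) on each draw. The Python sketch below illustrates this under the parameter values of the numerical section (t_1 = 1, T = 2, q = 0.10); names such as `ratio_seeo` and `ceeo_monte_carlo` are illustrative, and the number of paths is smaller than in the paper.

```python
import math
import random
from statistics import NormalDist

N = NormalDist().cdf  # cumulative standard normal distribution


def ratio_seeo(P, tau, sigma, delta_v, delta_d):
    """SEEO value per unit of the delivery asset D, as used in Eq. (22)."""
    delta = delta_v - delta_d
    d1 = (math.log(P) + (0.5 * sigma**2 - delta) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return P * math.exp(-delta_v * tau) * N(d1) - math.exp(-delta_d * tau) * N(d2)


def ceeo_monte_carlo(V0, D0, t1, T, q, sigma_v, sigma_d, rho_vd,
                     delta_v, delta_d, n, seed=0):
    """Crude Monte Carlo price of a CEEO following Eqs. (21)-(23)."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma_v**2 + sigma_d**2 - 2.0 * sigma_v * sigma_d * rho_vd)
    delta = delta_v - delta_d
    tau = T - t1
    P0 = V0 / D0
    drift = (-delta - 0.5 * sigma**2) * t1
    vol = sigma * math.sqrt(t1)

    total = 0.0
    for _ in range(n):
        P_t1 = P0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))  # Eq. (16)
        total += max(ratio_seeo(P_t1, tau, sigma, delta_v, delta_d) - q, 0.0)
    return D0 * math.exp(-delta_d * t1) * total / n


ceeo = ceeo_monte_carlo(180.0, 180.0, 1.0, 2.0, 0.10, 0.40, 0.30, 0.20,
                        0.15, 0.0, n=200_000, seed=1)
print(f"CEEO (sim) = {ceeo:.4f}")
```

For V = D = 180 the paper reports a true value of 11.1542, which gives a reference point for the estimate.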


4 The price of a Pseudo American Exchange Option (PAEO) 


Let t = 0 be the evaluation date and T be the maturity date of the exchange option. Let 
$S_2(V, D, T)$ be the value of a PAEO that can be exercised at time $T/2$ or T. Following 
Carr [5,6], the payoff of PAEO can be replicated by a portfolio containing two SEEOs 
and one CEEO. Hence, the value of PAEO is: 

$$S_2(V, D, T) = V e^{-\delta_v T} N_2\left(-d_1\left(\frac{P}{P_2^*}, \frac{T}{2}\right), d_1(P, T); -\rho\right) 
 + V e^{-\delta_v \frac{T}{2}} N\left(d_1\left(\frac{P}{P_2^*}, \frac{T}{2}\right)\right) 
 - D e^{-\delta_d T} N_2\left(-d_2\left(\frac{P}{P_2^*}, \frac{T}{2}\right), d_2(P, T); -\rho\right) 
 - D e^{-\delta_d \frac{T}{2}} N\left(d_2\left(\frac{P}{P_2^*}, \frac{T}{2}\right)\right), \tag{24}$$

where $\rho = \sqrt{\frac{T/2}{T}} = \sqrt{0.5}$ and $P_2^*$ is the unique value that makes the holder 
indifferent between exercising the PAEO or not at time $T/2$ and solves the following equation: 

$$P_2^* e^{-\delta_v \frac{T}{2}} N\left(d_1\left(P_2^*, \frac{T}{2}\right)\right) - e^{-\delta_d \frac{T}{2}} N\left(d_2\left(P_2^*, \frac{T}{2}\right)\right) = P_2^* - 1.$$

The PAEO will be exercised at mid-life time $T/2$ if the cash flows $(V_{T/2} - D_{T/2})$ exceed 
the opportunity cost of exercise, i.e., the value of the option $s(V, D, T/2)$: 

$$V_{T/2} - D_{T/2} \geq s(V, D, T/2). \tag{25}$$


It is clear that if the PAEO is not exercised at time $T/2$, then it’s just the value of a 
SEEO with maturity $T/2$, as given by Equation (4). However, the exercise condition 
can be re-expressed in terms of just one random variable by taking the delivery asset 
as numeraire. Dividing by the delivery asset price $D_{T/2}$, it results in: 

$$P_{T/2} - 1 \geq P_{T/2}\, e^{-\delta_v \frac{T}{2}} N(d_1(P_{T/2}, T/2)) - e^{-\delta_d \frac{T}{2}} N(d_2(P_{T/2}, T/2)). \tag{26}$$

So, if the condition (26) takes place, namely, if the value of P is higher than $P_2^*$ at 
moment $T/2$, the PAEO will be exercised at time $T/2$ and the payoff will be $(V_{T/2} - D_{T/2})$; 
otherwise the PAEO will be exercised at time T and the payoff will be $\max[V_T - D_T, 0]$. 
So, using the Monte Carlo approach, we can value the PAEO as the expected 
value of discounted cash flows under the risk-neutral probability measure: 

$$S_2(V, D, T) = e^{-r \frac{T}{2}} E_Q[(V_{T/2} - D_{T/2})\, 1_{(P_{T/2} \geq P_2^*)}] 
 + e^{-rT} E_Q[\max(0, V_T - D_T)\, 1_{(P_{T/2} < P_2^*)}]. \tag{27}$$

Using assets $D_{T/2}$ and $D_T$ as numeraires, after some manipulations, we can write 
that: 

$$S_2(V, D, T) = D_0 \left( e^{-\delta_d \frac{T}{2}} E_{\tilde{Q}}[g_s(P_{T/2})] + e^{-\delta_d T} E_{\tilde{Q}}[g_s(P_T)] \right), \tag{28}$$

where $g_s(P_{T/2}) = (P_{T/2} - 1)$ if $P_{T/2} \geq P_2^*$ and $g_s(P_T) = \max[P_T - 1, 0]$ if 
$P_{T/2} < P_2^*$. 
So, by the Monte Carlo simulation, we can approximate the PAEO as: 

$$S_2(V, D, T) \approx D_0 \left( \frac{\sum_{i \in A} g_s^i(\hat{P}_{T/2}^i)\, e^{-\delta_d \frac{T}{2}} + \sum_{i \in B} g_s^i(\hat{P}_T^i)\, e^{-\delta_d T}}{n} \right), \tag{29}$$

where $A = \{i = 1..n \text{ s.t. } \hat{P}_{T/2}^i \geq P_2^*\}$ and $B = \{i = 1..n \text{ s.t. } \hat{P}_{T/2}^i < P_2^*\}$. 
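
Equation (29) can be sketched in Python in two steps: first the critical ratio is found numerically from the indifference condition, then each path is routed to set A (exercise at T/2) or set B (continue to T). The bisection search, the function names and the reduced number of paths are assumptions made for this illustration; the paper does not say how the critical ratio was computed.

```python
import math
import random
from statistics import NormalDist

N = NormalDist().cdf  # cumulative standard normal distribution


def ratio_seeo(P, tau, sigma, delta_v, delta_d):
    """SEEO value per unit of the delivery asset D (right side of Eq. (26))."""
    delta = delta_v - delta_d
    d1 = (math.log(P) + (0.5 * sigma**2 - delta) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return P * math.exp(-delta_v * tau) * N(d1) - math.exp(-delta_d * tau) * N(d2)


def critical_ratio(half, sigma, delta_v, delta_d, tol=1e-10):
    """Solve (P - 1) = ratio_seeo(P, T/2) for P by bisection."""
    def gap(P):
        return (P - 1.0) - ratio_seeo(P, half, sigma, delta_v, delta_d)

    lo, hi = 1.0, 2.0          # gap(1) < 0 because the option value is positive
    for _ in range(60):
        if gap(hi) > 0.0:
            break
        hi *= 2.0
    else:
        raise ValueError("no critical ratio found; early exercise may never pay")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def paeo_monte_carlo(V0, D0, T, sigma_v, sigma_d, rho_vd, delta_v, delta_d,
                     n, seed=0):
    """Crude Monte Carlo price of a PAEO following Eq. (29)."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma_v**2 + sigma_d**2 - 2.0 * sigma_v * sigma_d * rho_vd)
    delta = delta_v - delta_d
    half = 0.5 * T
    P_star = critical_ratio(half, sigma, delta_v, delta_d)
    drift = (-delta - 0.5 * sigma**2) * half
    vol = sigma * math.sqrt(half)

    total = 0.0
    for _ in range(n):
        P_half = (V0 / D0) * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        if P_half >= P_star:                       # set A: exercise at T/2
            total += (P_half - 1.0) * math.exp(-delta_d * half)
        else:                                      # set B: continue to T
            P_T = P_half * math.exp(drift + vol * rng.gauss(0.0, 1.0))
            total += max(P_T - 1.0, 0.0) * math.exp(-delta_d * T)
    return D0 * total / n


paeo = paeo_monte_carlo(180.0, 180.0, 2.0, 0.40, 0.30, 0.20, 0.15, 0.0,
                        n=200_000, seed=1)
print(f"PAEO (sim) = {paeo:.4f}")
```

For V = D = 180 the paper reports a true value of 23.5056; the critical ratio found by the search is close to 1.19 for these parameters.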


5 Numerical examples and variance reduction techniques 


In this section we report the results of numerical simulations of SEEO, CEEO and 
PAEO and we propose a generalisation of the antithetic method and a new a-stratified 
sampling in order to improve on the speed and the efficiency of simulations. To 
compute the simulations we have assumed that the number of simulated paths n is 
equal to 500 000. The parameter values are $\sigma_v = 0.40$, $\sigma_d = 0.30$, $\rho_{vd} = 0.20$, 
$\delta_v = 0.15$, $\delta_d = 0$ and T = 2 years. Furthermore, to compute the CEEO we assume 
that $t_1 = 1$ year and the exchange ratio q = 0.10. Table 1 summarises the results of 
SEEO simulations, while Table 2 shows the CEEO’s simulated values. Finally, Table 
3 contains the numerical results of PAEO. 

Using Equation (16), we can observe that $Y = \ln\left(\frac{P_t}{P_0}\right)$ follows a normal distribution 
with mean $\left(-\delta - \frac{\sigma^2}{2}\right) t$ and variance $\sigma^2 t$. So, the random variable Y can be generated 
by the inverse of the normal cumulative distribution function 
$Y = F^{-1}\left(u; \left(-\delta - \frac{\sigma^2}{2}\right) t, \sigma^2 t\right)$, where u is a function of a uniform random variable U[0, 1]. Using the 
Matlab algorithm, we can generate the n simulated prices $\hat{P}_t^i$, for i = 1...n, as: 

Pt=P0*exp(norminv(u,-d*t-0.5*sig^2*t,sig*sqrt(t))), 


where u = rand(1,n) are the n random uniform values between 0 and 1. As the 
simulated price P/ depends on random value u;, we write henceforth that the SEEO, 
CEEO and PAEO payoffs g;, fork = s,c using a single stochastic factor depend 


Table 1. Simulation prices of Simple European Exchange Option (SEEO) 


Vy Do SEEO (true) SEEO (sim) 6? En 62, Eff 62 Eff 62 


gst Effest 


180 180 19.8354 19.8221 0.1175 0.0011 0.0516 1.13 0.0136 4.32 1.02e-8 22.82 
180 200 16.0095 16.0332 0.0808 8.98e-4 0.0366 1.10 0.0068 5.97 8.08e-9 19.98 
180 220 12.9829 12.9685 0.0535 7.31e-4 0.0258 1.03 0.0035 7.56 5.89e-9 18.15 
200 180 26.8315 26.8506 0.1635 0.0013 0.0704 1.16 0.0253 3.23 1.27e-8 25.54 
200 200 22.0393 22.0726 0.1137 0.0011 0.0525 1.08 0.0135 4.19 1.03e-8 21.97 
200 220 18.1697 18.1746 0.0820 9.05e-4 0.0379 1.08 0.0072 5.65 8.37e-9 19.58 
220 180 34.7572 34.7201 0.2243 0.0015 0.0939 1.19 0.0417 2.68 1.54e-8 28.94 
220 200 28.9873 28.9479 0.1573 0.0013 0.0695 1.13 0.0238 3.30 1.23e-8 25.45 
220 220 24.2433 24.2096 0.1180 0.0011 0.0517 1.14 0.0135 4.35 1.03e-8 22.88 
Table 2. Simulation prices of Compound European Exchange Option (CEEO)

$V_0$  $D_0$  CEEO (true)  CEEO (sim)  $\hat\sigma^2_n$  $\varepsilon_n$  $\hat\sigma^2_{av}$  $Eff_{av}$  $\hat\sigma^2_{st}$  $Eff_{st}$  $\hat\sigma^2_{gst}$  $Eff_{gst}$
180    180    11.1542      11.1590     0.0284            2.38e-4          0.0123               1.15        0.0043               3.30        2.23e-9               25.44
180    200    8.0580       8.0830      0.0172            1.85e-4          0.0078               1.10        0.0019               4.64        1.61e-9               21.28
180    220    5.8277       5.8126      0.0103            1.43e-4          0.0048               1.06        0.0008               6.30        1.15e-9               17.89
200    180    16.6015      16.6696     0.0464            3.04e-4          0.0184               1.25        0.0089               2.60        2.99e-9               30.94
200    200    12.3935      12.4010     0.0283            2.37e-4          0.0124               1.14        0.0043               3.28        2.22e-9               25.40
200    220    9.2490       9.2226      0.0179            1.89e-4          0.0080               1.11        0.0020               4.42        1.67e-9               21.37
220    180    23.1658      23.1676     0.0684            3.69e-4          0.0263               1.30        0.0158               2.15        3.83e-9               35.71
220    200    17.7837      17.7350     0.0439            2.96e-4          0.0180               1.21        0.0083               2.65        2.91e-9               30.07
220    220    13.6329      13.6478     0.0285            2.38e-4          0.0122               1.17        0.0043               3.33        2.22e-9               25.66


Table 3. Simulation prices of Pseudo American Exchange Option (PAEO)

$V_0$  $D_0$  PAEO (true)  PAEO (sim)  $\hat\sigma^2_n$  $\varepsilon_n$  $\hat\sigma^2_{av}$  $Eff_{av}$  $\hat\sigma^2_{st}$  $Eff_{st}$  $\hat\sigma^2_{gst}$  $Eff_{gst}$
180    180    23.5056      23.5152     0.0833            9.12e-4          0.0333               1.26        0.0142               2.93        3.29e-8               25.31
180    200    18.6054      18.6699     0.0581            7.62e-4          0.0250               1.16        0.0083               3.51        3.96e-8               14.65
180    220    14.8145      14.8205     0.0411            6.41e-4          0.0183               1.12        0.0051               4.00        3.72e-8               11.03
200    180    32.3724      32.3501     0.1172            0.0011           0.0436               1.34        0.0247               2.36        3.44e-8               24.86
200    200    26.1173      26.1588     0.0839            9.16e-4          0.0328               1.27        0.0142               2.95        3.27e-8               25.64
200    220    21.1563      21.1814     0.0600            7.74e-4          0.0253               1.18        0.0053               3.43        3.83e-8               15.63
220    180    42.5410      42.5176     0.1571            0.0013           0.0536               1.46        0.0319               2.46        3.97e-8               32.82
220    200    34.9165      34.9770     0.1134            0.0011           0.0422               1.34        0.0233               2.43        2.36e-8               27.90
220    220    28.7290      28.7840     0.0819            9.04e-4          0.0338               1.21        0.0142               2.87        3.35e-8               24.41


on $u_i$. We can observe that the simulated values are very close to the true ones. In particular, the standard error $\varepsilon_n = \hat\sigma_n / \sqrt{n}$ is a measure of simulation accuracy, and it is usually estimated as the realised standard deviation of the simulations,

$$\hat\sigma_n = \sqrt{\frac{\sum_{i=1}^{n} \left(g_k^i(u_i)\right)^2}{n} - \left(\frac{\sum_{i=1}^{n} g_k^i(u_i)}{n}\right)^2},$$

divided by the square root of the number of simulations. Moreover,


to reduce the variance of results, we propose the Antithetic Variates (AV), the Stratified Sample with two intervals (ST) and a general stratified sample (GST). The Antithetic Variates consist in generating n independent pairwise averages $\frac{1}{2}\left(g_k^i(u_i) + g_k^i(1 - u_i)\right)$ with $u_i \sim U[0, 1]$. The function $g_k^i(1 - u_i)$ decreases whenever $g_k^i(u_i)$ increases, and this produces a negative covariance $\mathrm{cov}\left[g_k^i(u_i),\, g_k^i(1 - u_i)\right] < 0$ and so a variance reduction. For instance, we can rewrite the Monte Carlo pricing given by Equation (18) as:

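$$s_{av}(V, D, T) \approx D e^{-\delta_d T} \left( \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{1}{2}\, g_s^i(u_i) + \frac{1}{2}\, g_s^i(1 - u_i) \right] \right) \qquad (30)$$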

We can observe that the variance $\hat\sigma^2_{av}$ is halved, but if we generate n = 500 000 uniform variates $u$ and we also use the values of $1 - u$, it results in a total of 1 000 000 function evaluations. Therefore, in order to determine the efficiency $Eff_{av}$, the variance $\hat\sigma^2_n$ should be compared with the same number (1 000 000) of function evaluations. We can conclude that the efficiency is $Eff_{av} = \frac{\hat\sigma^2_n / 2n}{\hat\sigma^2_{av} / n}$ and that the introduction of antithetic variates has the same effect on precision as doubling the sample size of Monte Carlo path-simulations.
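The antithetic construction described above can be sketched in a few lines. The following Python fragment is only an illustration and not the code used for the tables: the parameter values `P0`, `DELTA`, `SIGMA` and `T` are hypothetical, the payoff is the SEEO-type payoff max(0, P − 1), and the efficiency is computed as $(\hat\sigma^2_n / 2n) / (\hat\sigma^2_{av} / n)$, a form that is consistent with the tabulated values.

```python
import math
import random
from statistics import NormalDist, fmean, pvariance

# Hypothetical parameters, chosen only for illustration (not the paper's values).
P0, DELTA, SIGMA, T = 1.0, 0.02, 0.30, 1.0
LOG_RETURN = NormalDist(-DELTA * T - 0.5 * SIGMA**2 * T, SIGMA * math.sqrt(T))


def price(u):
    """Inverse-transform price: P_t = P0 * exp(F^{-1}(u))."""
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return P0 * math.exp(LOG_RETURN.inv_cdf(u))


def payoff(u):
    """SEEO-type payoff max(0, P - 1), written as a function of u."""
    return max(0.0, price(u) - 1.0)


def crude_vs_antithetic(n, seed=0):
    rng = random.Random(seed)
    us = [rng.random() for _ in range(n)]
    crude = [payoff(u) for u in us]
    pairs = [0.5 * (payoff(u) + payoff(1.0 - u)) for u in us]
    var_n, var_av = pvariance(crude), pvariance(pairs)
    return {
        "crude_mean": fmean(crude),
        "av_mean": fmean(pairs),
        "std_error": math.sqrt(var_n / n),  # eps_n = sigma_n / sqrt(n)
        # n pairs cost 2n payoff evaluations, so the benchmark is crude
        # Monte Carlo with 2n draws.
        "eff_av": (var_n / (2 * n)) / (var_av / n),
    }


result = crude_vs_antithetic(20_000)
print(result)
```

An efficiency above 1 means that the n antithetic pairs are more precise than 2n independent draws; how far above 1 it lies depends on how strongly negative the covariance between the two halves of each pair is.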

Using the Stratified Sample, we concentrate the sample in the region where the function $g$ is positive and, where the function is more variable, we use larger sample sizes. First of all, we consider the piecewise combination $a\, g_k^i(u_1) + (1 - a)\, g_k^i(u_2)$, where $u_1 \sim U[0, a]$ and $u_2 \sim U[a, 1]$, as an individual sample. This is a weighted average of two function values with weights $a$ and $1 - a$ proportional to the length of the corresponding intervals. If $u_1$ and $u_2$ are independent, we obtain a dramatic improvement in variance reduction, since the variance becomes $a^2\, \mathrm{var}\left[g_k^i(u_1)\right] + (1 - a)^2\, \mathrm{var}\left[g_k^i(u_2)\right]$. For instance, the payoff of SEEO $g_s^i(P_T^i) = \max[0, P_T^i - 1]$ with $V_0 = 180$ and $D_0 = 180$ has a positive value starting from $a_s = 0.60$, as shown in Figure 1(a), while the CEEO will be exercised when the asset price $P$ exceeds 0.9878 and the payoff will be positive from $a_c = 0.50$, as illustrated in Figure 1(b). Assuming $a = 0.90$, Tables 1, 2 and 3 show the variance using the Stratified Sample (ST) and the efficiency index. For the same reason as before, we should compare this result with the Monte Carlo variance with the same number (1 000 000) of path simulations. The efficiency index $Eff_{st} = \frac{\hat\sigma^2_n / 2n}{\hat\sigma^2_{st} / n}$ shows that the improvement is about 4. We can assert that it is possible to use one fourth of the sample size by stratifying the sample into two regions: $[0, a]$ and $[a, 1]$.
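The two-interval stratification can be illustrated in the same way. The Python sketch below is an assumption-laden example, not the procedure used for the tables: the parameters `P0`, `DELTA`, `SIGMA` and `T` are hypothetical, and the efficiency is again measured against crude Monte Carlo with twice as many draws, because each stratified sample uses two payoff evaluations.

```python
import math
import random
from statistics import NormalDist, fmean, pvariance

# Hypothetical parameters, chosen only for illustration (not the paper's values).
P0, DELTA, SIGMA, T = 1.0, 0.02, 0.30, 1.0
LOG_RETURN = NormalDist(-DELTA * T - 0.5 * SIGMA**2 * T, SIGMA * math.sqrt(T))


def payoff(u):
    """SEEO-type payoff max(0, P - 1) with P obtained by inverse transform."""
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return max(0.0, P0 * math.exp(LOG_RETURN.inv_cdf(u)) - 1.0)


def stratified_two(n, a=0.90, seed=0):
    """n samples of a*g(u1) + (1-a)*g(u2), u1 ~ U[0, a], u2 ~ U[a, 1]."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        u1 = a * rng.random()              # uniform on [0, a]
        u2 = a + (1.0 - a) * rng.random()  # uniform on [a, 1]
        samples.append(a * payoff(u1) + (1.0 - a) * payoff(u2))
    return fmean(samples), pvariance(samples)


def crude(n, seed=1):
    rng = random.Random(seed)
    values = [payoff(rng.random()) for _ in range(n)]
    return fmean(values), pvariance(values)


n = 20_000
st_mean, var_st = stratified_two(n)
crude_mean, var_n = crude(n)
eff_st = (var_n / (2 * n)) / (var_st / n)
print(st_mean, crude_mean, eff_st)
```

With these illustrative parameters the gain comes mainly from the upper stratum $[a, 1]$, where the payoff is most variable but which enters the estimator only with weight $(1 - a)^2$ in the variance.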

Fig. 1. Cumulative normal distribution of asset P: (a) SEEO; (b) CEEO (horizontal axis: Asset Price P; vertical axis: Cumulative Normal Distribution)

Finally, we consider the general stratified sample, subdividing the interval [0, 1] into convenient subintervals. Then, if we use the stratified method with two strata [0.80, 0.90], [0.90, 1], Tables 1, 2 and 3 show the variance and also the efficiency gain $Eff_{gst} = \frac{\hat\sigma^2_n / n}{\hat\sigma^2_{gst}}$. Moreover, for the first simulation of SEEO the optimal choice of sample sizes is n = 66 477, 433 522; for the first simulation of CEEO we obtain n = 59 915, 440 084; while for the PAEO it results that n = 59 492, 440 507. It is plain that the functions $g_k^i$ are more variable in the interval [0.90, 1], and so the sample size there is about 440 000. We can observe that this stratified sample can account for an improvement in efficiency of about 23.
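A general stratified estimator with unequal sample sizes can be sketched as follows. This is a hedged illustration only: the text does not state how the optimal sample sizes were obtained, so the sketch assumes the standard Neyman allocation ($n_j$ proportional to stratum length times stratum standard deviation, estimated in a short pilot run). The strata edges `[0, 0.80, 0.90, 1]`, the pilot size and the parameters `P0`, `DELTA`, `SIGMA`, `T` are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, pvariance

# Hypothetical parameters, chosen only for illustration (not the paper's values).
P0, DELTA, SIGMA, T = 1.0, 0.02, 0.30, 1.0
LOG_RETURN = NormalDist(-DELTA * T - 0.5 * SIGMA**2 * T, SIGMA * math.sqrt(T))


def payoff(u):
    """SEEO-type payoff max(0, P - 1) with P obtained by inverse transform."""
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return max(0.0, P0 * math.exp(LOG_RETURN.inv_cdf(u)) - 1.0)


def stratified(n_total, edges, n_pilot=2_000, seed=0):
    rng = random.Random(seed)
    strata = list(zip(edges[:-1], edges[1:]))

    def draw(lo, hi, m):
        return [payoff(lo + (hi - lo) * rng.random()) for _ in range(m)]

    # Pilot run: stratum length times estimated payoff standard deviation.
    weights = [(hi - lo) * math.sqrt(pvariance(draw(lo, hi, n_pilot)))
               for lo, hi in strata]
    # Assumed Neyman allocation: n_j proportional to p_j * sigma_j.
    sizes = [max(2, round(n_total * w / sum(weights))) for w in weights]

    estimate, variance = 0.0, 0.0
    for (lo, hi), n_j in zip(strata, sizes):
        g = draw(lo, hi, n_j)
        p_j = hi - lo
        estimate += p_j * fmean(g)
        variance += p_j**2 * pvariance(g) / n_j  # variance of the estimator
    return estimate, variance, sizes


n = 20_000
estimate, var_gst, sizes = stratified(n, [0.0, 0.80, 0.90, 1.0])
rng = random.Random(1)
var_n = pvariance([payoff(rng.random()) for _ in range(n)])
eff_gst = (var_n / n) / var_gst
print(estimate, sizes, eff_gst)
```

Here `var_gst` is already the variance of the stratified estimator, so it is compared with $\hat\sigma^2_n / n$ for the same total number of draws; the allocation places draws according to both the length of each stratum and the variability of the payoff inside it.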


6 Conclusions 


In this paper we have shown a generalisation of the antithetic method and an a-sampling procedure to value exchange options, improving on the Monte Carlo simulation. Using the delivery asset D as numeraire, we have reduced the bi-dimensionality of the evaluation to one stochastic variable P, the ratio between assets V and D. However, the particular evolution of asset P requires a new sampling procedure that concentrates the simulations in the range in which P is more sensitive, in order to reduce the variance. The paper could be improved by choosing a* so as to minimise the variance of the simulation through an endogenous process. To this end, a short pilot simulation could be used to estimate an optimal a*, followed by the a*-stratification.
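One possible reading of this suggestion is a grid search driven by short pilot runs. The Python sketch below is purely illustrative: the candidate grid, the pilot size and the parameters `P0`, `DELTA`, `SIGMA`, `T` are hypothetical, and the selection rule (smallest pilot variance of the two-interval stratified sample) is an assumption rather than a procedure given in the paper.

```python
import math
import random
from statistics import NormalDist, pvariance

# Hypothetical parameters, chosen only for illustration (not the paper's values).
P0, DELTA, SIGMA, T = 1.0, 0.02, 0.30, 1.0
LOG_RETURN = NormalDist(-DELTA * T - 0.5 * SIGMA**2 * T, SIGMA * math.sqrt(T))


def payoff(u):
    """SEEO-type payoff max(0, P - 1) with P obtained by inverse transform."""
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return max(0.0, P0 * math.exp(LOG_RETURN.inv_cdf(u)) - 1.0)


def pilot_variance(a, n_pilot, rng):
    """Variance of a*g(u1) + (1-a)*g(u2), u1 ~ U[0, a], u2 ~ U[a, 1]."""
    draws = [a * payoff(a * rng.random())
             + (1.0 - a) * payoff(a + (1.0 - a) * rng.random())
             for _ in range(n_pilot)]
    return pvariance(draws)


def choose_a_star(grid, n_pilot=2_000, seed=0):
    """Return the grid point with the smallest pilot variance."""
    rng = random.Random(seed)
    variances = {a: pilot_variance(a, n_pilot, rng) for a in grid}
    return min(variances, key=variances.get), variances


grid = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
a_star, variances = choose_a_star(grid)
print(a_star, variances)
```

The selected value would then be used as the split point of the full-size stratified simulation; because the pilot variances are themselves noisy, neighbouring grid points with similar variances should be regarded as equivalent.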


Acknowledgement. Many thanks to the anonymous reviewers for their constructive comments. 